Batch normalization for input of any dimension, adapted from Dandelion's BatchNorm class. The normalization is computed as
x' = \gamma * \frac{(x-\mu)}{\sigma} + \beta
You can build nonstandard BN variants by disabling any of the parameters {\mu, \sigma, \gamma, \beta}.
class BatchNorm(input_shape=None, axes='auto', eps=1e-5, alpha=0.01,
beta=0.0, gamma=1.0, mean=0.0, inv_std=1.0)
- input_shape: tuple or list of ints, or tensor. Input shape of the module, including the batch dimension.
- axes: 'auto', int, or tuple of int. The axis or axes to normalize over. If 'auto' (the default), normalize over all axes except the second: this normalizes over the minibatch dimension for dense layers, and additionally over all spatial dimensions for convolutional layers.
- eps: small constant 𝜖 added to the variance before taking the square root and dividing by it, to avoid numerical problems
- alpha: coefficient for the exponential moving average of batch-wise means and standard deviations computed during training; the closer to one, the more it will depend on the last batches seen
- gamma, beta: these two parameters can be set to None to disable the controversial scale and shift and to save computation. According to the Deep Learning Book, Section 8.7.1, disabling \gamma and \beta might reduce the expressive power of the neural network.
- mean, inv_std: initial values for \mu and \frac{1}{\sigma}. These two parameters can also be set to None to disable the mean subtraction and variance scaling.
A module attribute switches between training mode and inference mode.
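The normalization formula and the 'auto' axes rule above can be illustrated with a minimal NumPy sketch. This is not the module's actual implementation: the names `batch_norm_train` and `update_running`, the `use_mean` / `use_inv_std` flags, and the use of None to mark a disabled \gamma or \beta are assumptions made for illustration, and the running-average update assumes the same convention as the alpha description (closer to one means more weight on the latest batch).

```python
import numpy as np

def batch_norm_train(x, axes="auto", eps=1e-5, gamma=1.0, beta=0.0,
                     use_mean=True, use_inv_std=True):
    """Training-mode batch normalization (illustrative sketch, hypothetical name)."""
    x = np.asarray(x, dtype=np.float64)
    if axes == "auto":
        # Normalize over every axis except the second (feature/channel) axis.
        axes = tuple(i for i in range(x.ndim) if i != 1)
    elif isinstance(axes, int):
        axes = (axes,)
    else:
        axes = tuple(axes)

    # Batch statistics, kept broadcastable against x.
    mu = x.mean(axis=axes, keepdims=True)
    inv_std = 1.0 / np.sqrt(x.var(axis=axes, keepdims=True) + eps)

    out = x
    if use_mean:            # mean subtraction can be disabled
        out = out - mu
    if use_inv_std:         # variance scaling can be disabled
        out = out * inv_std
    if gamma is not None:   # assumption: None disables the scale
        out = out * gamma
    if beta is not None:    # assumption: None disables the shift
        out = out + beta
    return out, mu, inv_std

def update_running(old, batch, alpha=0.01):
    """Exponential moving average; alpha near one favors the latest batch."""
    return (1.0 - alpha) * old + alpha * batch

# Usage: a convolutional-style input of shape (batch, channels, height, width).
rng = np.random.default_rng(0)
x = rng.normal(loc=3.0, scale=2.0, size=(8, 3, 5, 5))
y, mu, inv_std = batch_norm_train(x)
running_mean = update_running(np.zeros_like(mu), mu)
```

With 'auto' axes, the statistics for a 4-D input have shape (1, 3, 1, 1), that is, one mean and one inverse standard deviation per channel.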
Estimates class centers with a moving average, adapted from Dandelion's Center class.
class Center(feature_dim, center_num, alpha=0.9, centers=None)
- feature_dim: feature dimension
- center_num: number of class centers
- centers: initial values of the class centers, with shape (center_num, feature_dim)
- alpha: moving average coefficient; the closer to one, the more the centers depend on the most recent batches: C_{new} = \alpha*C_{batch} + (1-\alpha)*C_{old}
.forward(features=None, labels=None)
- features: batch features, from which the class centers will be estimated
- labels: class labels corresponding to features
- return: the estimated centers

A module attribute switches between training mode and inference mode. In training mode, features and labels are required as input; in inference mode these inputs are ignored.
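The moving-average update C_{new} = \alpha*C_{batch} + (1-\alpha)*C_{old} can be sketched in NumPy as follows. The function name `update_centers` is hypothetical, and two behaviors here are assumptions rather than documented facts about the Center class: C_{batch} is taken to be the mean of the batch features for each class, and classes that do not appear in the batch keep their old centers.

```python
import numpy as np

def update_centers(centers, features, labels, alpha=0.9):
    """One training-mode update of the class centers (illustrative sketch)."""
    centers = np.array(centers, dtype=np.float64)       # copy, shape (center_num, feature_dim)
    features = np.asarray(features, dtype=np.float64)   # shape (batch, feature_dim)
    labels = np.asarray(labels)                         # shape (batch,), integer class ids
    for c in np.unique(labels):
        # Assumption: the batch center of class c is the mean of its features.
        batch_center = features[labels == c].mean(axis=0)
        centers[c] = alpha * batch_center + (1.0 - alpha) * centers[c]
    # Assumption: classes missing from this batch are left unchanged.
    return centers

# Usage: 3 classes, 2-dimensional features, centers initialized to zero.
centers = np.zeros((3, 2))
features = [[1.0, 1.0], [3.0, 3.0], [10.0, 0.0]]
labels = [0, 0, 2]
centers = update_centers(centers, features, labels, alpha=0.9)
```

Here class 0 moves toward the batch mean (2, 2), class 2 moves toward (10, 0), and class 1, which has no samples in the batch, stays at its initial value.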