BatchNormal(epsilon=1e-06, momentum=0.9, axis=0, beta_init='zero', gamma_init='one')
Batch normalization layer (Ioffe and Szegedy, 2015) [R11].
Normalizes the activations of the previous layer at each batch, i.e. applies a transformation that keeps the mean activation close to 0 and the activation standard deviation close to 1.
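The transformation described above can be illustrated with a minimal NumPy sketch. This is not npdl's implementation: the function name `batch_norm_forward` and the choice to normalize over the batch axis of a 2-D input are assumptions made for illustration only.

```python
import numpy as np


def batch_norm_forward(x, gamma, beta, epsilon=1e-6):
    """Hypothetical helper: normalize each feature over the batch axis,
    then apply the learned scale (gamma) and shift (beta)."""
    mean = x.mean(axis=0)                         # per-feature batch mean
    var = x.var(axis=0)                           # per-feature batch variance
    x_hat = (x - mean) / np.sqrt(var + epsilon)   # roughly zero mean, unit std
    return gamma * x_hat + beta


# Inputs with mean 5 and standard deviation 3, shape (samples, features).
rng = np.random.default_rng(0)
x = rng.normal(loc=5.0, scale=3.0, size=(64, 4))

# With gamma = 1 and beta = 0 (the 'one' and 'zero' initializations),
# the output is simply the normalized input.
out = batch_norm_forward(x, gamma=np.ones(4), beta=np.zeros(4))
```

With these initial values, each column of `out` has a mean close to 0 and a standard deviation close to 1, and `out` has the same shape as `x`.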
epsilon : small float > 0
Fuzz parameter: a small constant added to the denominator to avoid division by zero. npdl expects epsilon >= 1e-5.
axis : integer
Axis along which to normalize. For instance, if the input tensor has shape (samples, channels, rows, cols), set axis to 1 to normalize per feature map (channels axis).
momentum : float
Momentum used when computing the exponential moving average of the mean and standard deviation of the data, for feature-wise normalization.
beta_init : npdl.initializations.Initializer
Name of the initialization function for the shift parameter (beta), or alternatively an npdl initializer to use for it.
gamma_init : npdl.initializations.Initializer
Name of the initialization function for the scale parameter (gamma), or alternatively an npdl initializer to use for it.
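The roles of `axis` and `momentum` can be sketched in plain NumPy. Two assumptions are made here, neither confirmed by npdl's source: that "normalize per feature map" means one mean and one variance per channel, computed over all remaining axes, and that the running average is updated as `momentum * running + (1 - momentum) * batch_value`. The helper names `channel_stats` and `update_running` are hypothetical.

```python
import numpy as np


def channel_stats(x, axis=1):
    """Hypothetical helper: statistics per entry along `axis`,
    reduced over every other axis."""
    reduce_axes = tuple(i for i in range(x.ndim) if i != axis)
    mean = x.mean(axis=reduce_axes, keepdims=True)
    var = x.var(axis=reduce_axes, keepdims=True)
    return mean, var


def update_running(running, batch_value, momentum=0.9):
    """Assumed exponential-moving-average convention."""
    return momentum * running + (1.0 - momentum) * batch_value


# Input of shape (samples, channels, rows, cols).
rng = np.random.default_rng(0)
x = rng.normal(size=(8, 3, 5, 5))

# axis=1 keeps the channels axis: one statistic per feature map.
mean, var = channel_stats(x, axis=1)

# One running-average step, starting from zeros.
running_mean = update_running(np.zeros_like(mean), mean, momentum=0.9)
```

Here `mean` and `var` have shape `(1, 3, 1, 1)`, so they broadcast against the input. Under the assumed convention, a momentum close to 1 makes the running statistics change slowly from batch to batch.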
# Input shape
Arbitrary. Use the keyword argument input_shape (tuple of integers, does not include the samples axis) when using this layer as the first layer in a model.
# Output shape
Same shape as input.
[R11] [Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift](https://arxiv.org/abs/1502.03167)