Developer Guide for Intel® Data Analytics Acceleration Library 2019 Update 5
The forward batch normalization layer [Ioffe2015] normalizes each element $x_{i_1 \ldots i_p}$ of the input $X \in \mathbb{R}^{n_1 \times n_2 \times \ldots \times n_p}$ for the dimension $k \in \{1, \ldots, p\}$, and then scales and shifts the result of the normalization using the provided weights and biases.
The normalization uses the following characteristics, computed for the input $X$:
means
standard deviation, which is derived from the variance
The weights and biases are learned along with the rest of the model parameters.
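As a sketch of how these quantities fit together, the transform of [Ioffe2015] can be written as below. The symbols are notation chosen here for illustration: $\omega$ for weights, $\beta$ for biases, $\mu$, $s^2$ and $\sigma$ for the mean, variance and standard deviation, and $M$ for the number of elements that share one index $i_k$. The $1/M$ (biased) variance and the small stabilizing constant $\varepsilon$ follow [Ioffe2015] and are assumptions about what the library computes.

```latex
\begin{align*}
y_{i_1 \ldots i_p} &= \omega_{i_k} \, \frac{x_{i_1 \ldots i_p} - \mu_{i_k}}{\sigma_{i_k}} + \beta_{i_k}, \\
\mu_{i_k} &= \frac{1}{M} \sum_{\substack{i_j = 1, \ldots, n_j \\ j \neq k}} x_{i_1 \ldots i_p},
  \qquad M = \prod_{j \neq k} n_j, \\
s_{i_k}^2 &= \frac{1}{M} \sum_{\substack{i_j = 1, \ldots, n_j \\ j \neq k}}
  \left( x_{i_1 \ldots i_p} - \mu_{i_k} \right)^2, \\
\sigma_{i_k} &= \sqrt{s_{i_k}^2 + \varepsilon}.
\end{align*}
```

Under this reading, every slice of $X$ with a fixed index $i_k$ gets its own mean, standard deviation, weight and bias, so the weights and biases each hold $n_k$ values.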
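Given a p-dimensional tensor $X \in \mathbb{R}^{n_1 \times n_2 \times \ldots \times n_p}$, the problem is to compute the p-dimensional tensor $Y \in \mathbb{R}^{n_1 \times n_2 \times \ldots \times n_p}$ of the same size, using the following quantities: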
mean
variance
standard deviation
weights
biases
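The computation in the problem statement above can be sketched in plain Python for the simplest case, $p = 2$ with $k = 2$, where statistics are taken per column over all rows. The function name `batch_norm_forward_2d`, the `eps` constant and the biased variance estimate are assumptions made for this illustration; this is not the library's API.

```python
import math


def batch_norm_forward_2d(x, weights, biases, eps=1e-5):
    """Normalize a 2-D input (list of rows) along its second dimension.

    Hypothetical helper, not a library API. Statistics are computed per
    column over all rows; eps and the biased variance are assumptions.
    """
    n_rows, n_cols = len(x), len(x[0])
    means, variances, stds = [], [], []
    for j in range(n_cols):
        column = [row[j] for row in x]
        mu = sum(column) / n_rows
        var = sum((v - mu) ** 2 for v in column) / n_rows  # biased estimate
        means.append(mu)
        variances.append(var)
        stds.append(math.sqrt(var + eps))
    # Normalize each element, then scale by the weight and shift by the bias.
    y = [
        [weights[j] * (row[j] - means[j]) / stds[j] + biases[j]
         for j in range(n_cols)]
        for row in x
    ]
    return y, means, variances


x = [[1.0, 10.0], [3.0, 30.0]]
y, means, variances = batch_norm_forward_2d(
    x, weights=[1.0, 2.0], biases=[0.0, 1.0]
)
```

The mini-batch means and variances are returned alongside the output because the training stage also needs them to update the population statistics.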
At the model training stage, in addition to normalizing the input, the layer computes the population mean and variance by applying an exponential moving average with smoothing factor $\alpha \in [0, 1]$ to the mini-batch means and variances.
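A minimal sketch of such a moving-average update follows. The text does not say which term $\alpha$ weights, so the convention used here (where $\alpha$ multiplies the new mini-batch value and $1 - \alpha$ multiplies the previous estimate) is an assumption, as is the helper name `update_population_stats`.

```python
def update_population_stats(pop_mean, pop_var, batch_mean, batch_var, alpha):
    """Blend mini-batch statistics into running population estimates.

    Hypothetical helper. Assumed convention: alpha weights the new
    mini-batch value and (1 - alpha) weights the previous estimate.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    new_mean = [alpha * b + (1.0 - alpha) * p
                for p, b in zip(pop_mean, batch_mean)]
    new_var = [alpha * b + (1.0 - alpha) * p
               for p, b in zip(pop_var, batch_var)]
    return new_mean, new_var


pop_mean, pop_var = [0.0, 0.0], [1.0, 1.0]
pop_mean, pop_var = update_population_stats(
    pop_mean, pop_var,
    batch_mean=[2.0, 20.0], batch_var=[1.0, 100.0],
    alpha=0.5,
)
```

Under this convention, $\alpha = 0$ leaves the population estimates unchanged and $\alpha = 1$ replaces them with the latest mini-batch values.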