Questions on batch normalization

I have a few questions on input data and batch normalization:

  1. With respect to input data normalization:
    Is the entire input data set normalized (subtract the mean, divide by the standard deviation) once when doing mini-batch optimization, or can each mini-batch input be normalized separately?

  2. With respect to mini-batch normalization (and making \beta^{[l]}, \gamma^{[l]} parameters to be optimized together with W^{[l]}), Prof Andrew talks about normalization for the hidden layers (normalizing z^{[1]}, z^{[2]}, \ldots).
    Can the same be done for z^{[0]}, which is the same as the mini-batch input X, and can we optimize for \beta^{[0]}, \gamma^{[0]} as well?

  1. Each mini-batch of data entering the batchnorm layer is standardized as it arrives (see the sketch after this list). Remember that \beta and \gamma are learnt along the way during training.
  2. You can technically use a batch norm layer to standardize your data by placing it before any processing layers. I have not seen anyone do it in place of explicit preprocessing, though.
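To make item 1 concrete, here is a minimal NumPy sketch of the training-time forward pass (the function name, shapes, and epsilon value are my own; it follows the course's features-by-examples layout). Each mini-batch is standardized with its own mean and variance, then rescaled and shifted by the learnable \gamma and \beta:

```python
import numpy as np

def batchnorm_forward(Z, gamma, beta, eps=1e-8):
    """Standardize one mini-batch of pre-activations Z (features x examples),
    then scale and shift with the learnable parameters gamma and beta."""
    mu = Z.mean(axis=1, keepdims=True)       # per-feature mean of THIS mini-batch
    var = Z.var(axis=1, keepdims=True)       # per-feature variance of THIS mini-batch
    Z_norm = (Z - mu) / np.sqrt(var + eps)   # zero mean, unit variance
    Z_tilde = gamma * Z_norm + beta          # learned scale and shift
    return Z_tilde, mu, var

# Illustrative mini-batch: 64 examples, 5 features
rng = np.random.default_rng(0)
Z = rng.normal(size=(5, 64))
gamma = np.ones((5, 1))    # initialized to 1 (identity scale)
beta = np.zeros((5, 1))    # initialized to 0 (no shift)
Z_tilde, mu, var = batchnorm_forward(Z, gamma, beta)
```

With \gamma initialized to 1 and \beta to 0 the layer starts out as a pure standardization of each mini-batch; gradient descent then moves \gamma and \beta to whatever scale and shift work best.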

In Question 1, I am trying to understand whether the input data X is normalized once as a preprocessing step, or whether each mini-batch input X^{i} is normalized separately.

Each mini-batch of data is normalized on the fly (separately) during training, while the parameters are learned simultaneously. At test time the parameters are fixed, and the normalization uses running averages of the per-batch means and variances accumulated during training.
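As a sketch of that training/test split (assuming exponentially weighted running averages; the momentum and epsilon values are illustrative, not prescribed by the course):

```python
import numpy as np

def batchnorm_train_step(Z, gamma, beta, running_mu, running_var,
                         momentum=0.9, eps=1e-8):
    """Training-time pass: normalize with THIS mini-batch's statistics
    and update the running averages used later at test time."""
    mu = Z.mean(axis=1, keepdims=True)
    var = Z.var(axis=1, keepdims=True)
    running_mu = momentum * running_mu + (1 - momentum) * mu
    running_var = momentum * running_var + (1 - momentum) * var
    Z_tilde = gamma * (Z - mu) / np.sqrt(var + eps) + beta
    return Z_tilde, running_mu, running_var

def batchnorm_test(Z, gamma, beta, running_mu, running_var, eps=1e-8):
    """Test-time pass: no batch statistics are computed; the fixed running
    averages (and the learned gamma, beta) are used instead."""
    return gamma * (Z - running_mu) / np.sqrt(running_var + eps) + beta
```

Deep learning frameworks do this bookkeeping for you (typically via a training/inference flag on the batch norm layer), but the idea is the same: batch statistics during training, fixed running statistics at test time.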

If this is confusing, please watch the lectures. Andrew does a really nice job of providing an overview of the batch normalization process.