ckim
When estimating the variance, batch normalization uses
the mean mu = 1/m * sum(x_i)
the variance sigma^2 = 1/m * sum((x_i - mu)^2)
but the usual unbiased estimator of the variance is
sigma^2 = 1/(m-1) * sum((x_i - mu)^2)
since dividing by m gives a biased estimate. Why does batch normalization divide by m and not by m-1 here?
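For concreteness, a minimal sketch of the two estimators in PyTorch (the tensor x and batch size m are just placeholders):

```python
import torch

m = 8
x = torch.randn(m)                        # a batch of m scalar activations

mu = x.mean()
var_biased = ((x - mu) ** 2).sum() / m          # what batchnorm uses: divide by m
var_unbiased = ((x - mu) ** 2).sum() / (m - 1)  # the "textbook" unbiased estimator

print(torch.allclose(var_biased, x.var(unbiased=False)))   # True
print(torch.allclose(var_unbiased, x.var(unbiased=True)))  # True
```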
Great question @ckim!
From the batchnorm paper, the unbiased variance estimate is actually used during inference: the moving variance is rescaled so that

Var[x] = m/(m-1) * E_B[sigma_B^2]

where the expectation is taken over training mini-batches of size m.
For completeness, the batchnorm training transform is summarized by

mu_B = 1/m * sum(x_i)
sigma_B^2 = 1/m * sum((x_i - mu_B)^2)
x_hat_i = (x_i - mu_B) / sqrt(sigma_B^2 + eps)
y_i = gamma * x_hat_i + beta

and the inference transform by

y = gamma * (x - E[x]) / sqrt(Var[x] + eps) + beta
However, as you point out, it has been debated before whether it is better to use the unbiased variance estimate during training as well.
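To see this split in behavior directly, here is a quick sketch against nn.BatchNorm1d (momentum=1.0 is set only so that running_var equals the latest batch statistic; eps is the layer default). The training-time output is normalized with the biased (1/m) variance, while the running variance is updated with the unbiased (1/(m-1)) estimate:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

m = 8
x = torch.randn(m, 3)                    # batch of m samples, 3 features

bn = nn.BatchNorm1d(3, momentum=1.0)     # momentum=1.0: running_var becomes the current batch statistic
bn.train()
out = bn(x)

mu = x.mean(dim=0)
var_biased = x.var(dim=0, unbiased=False)    # divides by m
var_unbiased = x.var(dim=0, unbiased=True)   # divides by m - 1

# Training-time normalization uses the biased (1/m) variance ...
expected = (x - mu) / torch.sqrt(var_biased + bn.eps)
print(torch.allclose(out, expected, atol=1e-5))                  # True

# ... while the running variance is updated with the unbiased (1/(m-1)) estimate
print(torch.allclose(bn.running_var, var_unbiased, atol=1e-5))   # True
```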