In the happyModel, the code shows that there are 128 parameters for the batch normalization step. Shouldn’t it be 64 instead, i.e. 32 betas and 32 gammas, since we use 32 filters in Conv2D?
Great question!
The following illustration is borrowed from Group Normalization by Yuxin Wu and Kaiming He:
From Prof Andrew Ng’s lecture, we know that batch normalization learns a scale (gamma) and a shift (beta) for each normalized feature. As Yuxin Wu and Kaiming He’s illustration shows, axis=-1 means that we calculate the mean and standard deviation over all pixel values (width x height) across the entire batch, separately for each channel.
In the Happy Model example, we have 32 channels, so we get 32 gammas and 32 betas. We also calculate 32 means and 32 standard deviations.
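To make this concrete, here is a small NumPy sketch of the per-channel statistics (the batch size and image dimensions are illustrative, not taken from the assignment):

```python
import numpy as np

# Illustrative NHWC batch: 4 images, 64x64 pixels, 32 channels.
x = np.random.randn(4, 64, 64, 32)

# axis=-1 normalization: average over batch, height, and width,
# leaving one mean and one standard deviation per channel.
mean = x.mean(axis=(0, 1, 2))
std = x.std(axis=(0, 1, 2))

print(mean.shape, std.shape)  # (32,) (32,)

# Normalize, then scale and shift with per-channel gamma and beta.
gamma = np.ones(32)   # 32 trainable scales
beta = np.zeros(32)   # 32 trainable shifts
x_norm = gamma * (x - mean) / (std + 1e-5) + beta
```

Note that `mean` and `std` each have exactly 32 entries, one per channel, which matches the count of gammas and betas.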
Running the example isolated in TensorFlow:
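A minimal sketch of an isolated `BatchNormalization` layer (the 64x64 spatial size is an assumption; only the 32 channels matter for the parameter count):

```python
import tensorflow as tf

# Hypothetical input shape: 64x64 feature maps with 32 channels,
# matching the 32 Conv2D filters.
bn = tf.keras.layers.BatchNormalization(axis=-1)
bn.build((None, 64, 64, 32))

# Trainable: gamma (32,) and beta (32,)
trainable = sum(int(tf.size(w)) for w in bn.trainable_weights)
# Non-trainable: moving_mean (32,) and moving_variance (32,)
non_trainable = sum(int(tf.size(w)) for w in bn.non_trainable_weights)

print(trainable, non_trainable)  # 64 64
```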
Consequently, the 64 trainable params are the 32 gammas and 32 betas, whereas the 64 non-trainable params are the 32 moving means and 32 moving variances that the layer tracks for inference. In total, the batch normalization layer has 128 parameters.
Great thank you! This is a very clear explanation.