# Bias parameter b - Fitting Batch Norm

In batch normalization, why does the parameter b cancel out?

In the post Batch Normalization: B parameter there is a similar small example: {a+k, b+k} and {a, b} have the same normalization.

But in the “Fitting Batch Norm into a Neural Network” video at 8:41, why does Andrew treat b as a constant? For a mini-batch at layer l we have Z = W*A + b, where b is a matrix, right? If we write out the first row of Z we get:
[W[0]*a1 + b11, W[0]*a2 + b12, W[0]*a3 + b13, …] where W[0] is the first row of W, the b's are the scalars of the first row of b, and the a's are the outputs of the previous layer. So the idea is that, component-wise, it is not a constant that is added.

Does this help?
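The algebra behind the cancellation: batch norm subtracts the per-unit batch mean, and a constant added to every example in the batch shifts that mean by exactly the same amount. For one unit with pre-activations $z_i$ over a mini-batch of size $m$:

$$\mu = \frac{1}{m}\sum_{i=1}^{m} z_i, \qquad \frac{1}{m}\sum_{i=1}^{m}(z_i + b) = \mu + b$$

The standard deviation is unchanged, since each deviation $(z_i + b) - (\mu + b) = z_i - \mu$. So:

$$\frac{(z_i + b) - (\mu + b)}{\sigma} = \frac{z_i - \mu}{\sigma}$$

and the normalized value is the same with or without b, which is what the code below checks numerically.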

```python
import numpy as np

np.random.seed(1)

# 10 observations, each with 2 features
x = np.random.random((10, 2))

# Dense layer with 3 units
weights = np.random.random((2, 3))
biases = np.random.random((1, 3))

# Normalize each unit's pre-activations over the batch (axis=0)
get_z_tilda = lambda z: (z - z.mean(axis=0)) / z.std(axis=0)

# The normalized values are identical with and without the bias:
# subtracting the batch mean removes any constant added per unit.
z_tilda1 = get_z_tilda(x @ weights + biases)
z_tilda2 = get_z_tilda(x @ weights)
assert np.allclose(z_tilda1, z_tilda2)
```
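On the "b is a matrix" point: in the code above, `biases` has shape (1, 3), and NumPy broadcasting adds that same bias row to every example. So along the batch dimension, each unit's bias really is a single constant, not a row of different scalars. A quick sketch checking this (reusing the same shapes as above; the row-per-example layout is my convention, not Andrew's column-per-example notation, but the broadcasting argument is the same):

```python
import numpy as np

np.random.seed(1)
x = np.random.random((10, 2))        # 10 examples in rows
weights = np.random.random((2, 3))
biases = np.random.random((1, 3))    # one scalar per unit

z = x @ weights + biases

# Broadcasting adds the same (1, 3) bias row to every example,
# exactly as if it had been tiled into a full (10, 3) matrix:
z_tiled = x @ weights + np.tile(biases, (10, 1))
assert np.allclose(z, z_tiled)
```

In Z = W*A + b with examples in columns, the "first row of b" is the same scalar repeated across the mini-batch by broadcasting, i.e. b11 = b12 = b13, which is why it is treated as a constant that normalization then removes.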