Batchnorm regularization

Deep Learning Specialization
week-3
Hey, I’ve watched the batch normalization video three times now, but I still don’t get why it has a regularization effect. Can you explain it to me in a simple way?


Normalization reduces the range of the input data. For example, image pixels take values in [0, 255]. If you feed those raw values in directly, you may need more memory, and the updates in your gradient descent algorithm will fluctuate a lot; because the steps are big, you can overshoot the optimum, i.e. jump past it.
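As a small illustration of the input-normalization idea above, here is a minimal NumPy sketch (the pixel values are made up) that rescales [0, 255] pixels to [0, 1] and then standardizes them to zero mean and unit variance:

```python
import numpy as np

pixels = np.array([[0, 64, 128], [192, 255, 32]], dtype=np.float64)

# Rescale from [0, 255] to [0, 1], then standardize to zero mean and
# unit variance so gradient steps operate on a comparable scale.
scaled = pixels / 255.0
standardized = (scaled - scaled.mean()) / scaled.std()

print(standardized.mean(), standardized.std())  # ~0.0 and ~1.0
```

Frameworks usually do this per channel over the whole training set, but the effect is the same: inputs on a small, consistent scale.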

On the other hand, if the values are small, the gradient descent steps are smaller and you have a better chance of converging on the optimum; smaller numbers also take less memory to compute with.

As for the regularization effect specifically: batch norm normalizes each layer's activations using the mean and variance of the current mini-batch. Those statistics are noisy estimates that change from batch to batch, so each activation is scaled and shifted slightly differently on every pass, a bit like the noise dropout injects. That noise is what gives batch norm its slight regularization effect.
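To make the batch-to-batch noise concrete, here is a minimal sketch (the `batch_norm` helper and the random batches are illustrative, not the course's code) showing that the *same* example is normalized differently depending on which mini-batch it lands in:

```python
import numpy as np

rng = np.random.default_rng(0)

def batch_norm(x, eps=1e-5):
    # Normalize with the statistics of the current mini-batch only,
    # as batch norm does at training time (no learned gamma/beta here).
    mu = x.mean(axis=0)
    var = x.var(axis=0)
    return (x - mu) / np.sqrt(var + eps)

# Put the same sample into two different mini-batches.
sample = np.array([[1.0, 2.0]])
batch_a = np.vstack([sample, rng.normal(0.0, 1.0, size=(31, 2))])
batch_b = np.vstack([sample, rng.normal(0.5, 2.0, size=(31, 2))])

out_a = batch_norm(batch_a)[0]
out_b = batch_norm(batch_b)[0]

# The normalized values differ because each batch has its own mean
# and variance; this is the noise that acts as a mild regularizer.
print(out_a, out_b)
```

At test time this noise disappears, since inference uses running averages of the statistics instead of per-batch estimates, which is also why the regularization effect only applies during training.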
