Batch Normalization vs Feature Input Normalization

I’m a little confused by the wording in the Week 3 video, Normalizing Activations in a Network.

Andrew says

So the intuition I hope you’ll take away from this is that we saw how normalizing the input features x can help learning in a neural network. And what batch norm does is it applies that normalization process not just to the input layer, but to the values even deep in some hidden layer in the neural network.

Is the takeaway here that, instead of normalizing the input features before feeding them to the NN, we can use a Batch Normalization layer to normalize not only the input features but also the hidden layer values?

Or is feature input normalization still required when using Batch Normalization?


Hello @maortiz,

  1. Normalizing our inputs will help ensure that our network can effectively learn the parameters in the first layer.
  2. Normalizing the activations from our first layer will help ensure that our network can effectively learn the parameters in the second layer.
  3. and so forth…

In practice, people typically normalize the value of z[l] rather than a[l], although it is sometimes debated whether we should normalize before or after the activation.
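Here is a minimal NumPy sketch of that idea; the names (`batch_norm_z`, `gamma`, `beta`) are just for illustration and are not from the course notebooks. It normalizes z[l] across the mini-batch and then applies the learnable scale and shift:

```python
import numpy as np

def batch_norm_z(z, gamma, beta, eps=1e-5):
    # z has shape (batch_size, n_units): the pre-activations z[l] for one mini-batch.
    mu = z.mean(axis=0, keepdims=True)       # per-unit mean over the mini-batch
    var = z.var(axis=0, keepdims=True)       # per-unit variance over the mini-batch
    z_norm = (z - mu) / np.sqrt(var + eps)   # zero mean, unit variance
    return gamma * z_norm + beta             # learnable scale/shift, fed into the activation

# Toy example: normalize z[l] for a mini-batch of 64 units before a ReLU activation.
z = np.random.randn(64, 10) * 3 + 5
gamma, beta = np.ones((1, 10)), np.zeros((1, 10))
a = np.maximum(0, batch_norm_z(z, gamma, beta))
```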

Instead of using batch norm on the inputs (which only looks at a specific mini-batch), you can use the whole training set and make sure you have zero mean and unit variance as a preprocessing step. In the case of images, you can use a different normalization strategy and scale by 1/255 instead, so that the pixel values are bounded between 0 and 1.
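As a rough sketch of that preprocessing step (the arrays `X_train`, `X_test`, and `images` here are placeholders, not course data):

```python
import numpy as np

# Toy data standing in for a real dataset (shapes and values are illustrative only).
X_train = np.random.rand(1000, 20) * 10
X_test = np.random.rand(200, 20) * 10

# Standardize using statistics from the whole training set, not a single mini-batch.
mean = X_train.mean(axis=0)
std = X_train.std(axis=0) + 1e-8
X_train_norm = (X_train - mean) / std
X_test_norm = (X_test - mean) / std      # reuse the training-set statistics at test time

# For images, simply rescale pixel values from [0, 255] into [0, 1].
images = np.random.randint(0, 256, size=(100, 28, 28))
images_norm = images / 255.0
```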

Additional reading: Normalizing your data (specifically, input and batch normalization).


Thank you for the clarification and the additional reading.


You are welcome. Good luck with the rest of the course! 🙂