Batch Norm at test time - Why add beta and gamma?

Hello all,

Can anyone please explain why we should still adjust our Z by beta and gamma at test time?

As I see it, our W[l] and b[l] have been computed through gradient descent (which was accelerated by Batch Norm). But once those {W[l], b[l]} have been determined, we should not need to compute any Z, dW, or beta in the test phase, since our model parameters were already set during training.
I’m sure I’m missing something, hence my question.

Thanks a lot, guys, for your kind explanations.

If batch normalization was performed on the training data, then the model parameters that gradient descent found depend on that normalization. Moreover, beta and gamma are themselves learned parameters, just like W[l] and b[l], so they are part of the trained model rather than something you recompute at test time.

At test time, the same (or a very similar) normalization has to be applied to the test data so that the activations arrive on the scale the model was trained on: each Z is normalized using the mean and variance estimated during training (typically exponentially weighted averages over the mini-batches), and then scaled and shifted by the same learned gamma and beta.
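As a minimal numpy sketch of that inference step (the names batchnorm_inference, running_mean, and running_var are illustrative, assuming the exponentially weighted averages of the batch mean and variance were saved during training):

```python
import numpy as np

def batchnorm_inference(z, gamma, beta, running_mean, running_var, eps=1e-8):
    """Apply batch norm at test time using statistics saved from training.

    gamma and beta are the learned scale/shift parameters; running_mean and
    running_var are the exponentially weighted averages accumulated over the
    training mini-batches. No batch statistics are computed here.
    """
    z_norm = (z - running_mean) / np.sqrt(running_var + eps)  # normalize with training stats
    return gamma * z_norm + beta                              # same learned scale and shift
```

Note that nothing in this function depends on the test batch itself, so it works even for a single example at a time.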


Thank you, gent, for this prompt response.

But does this mean that for production data, we’ll also need to normalize each and every input (image, text, whatever input type)?


Yes.
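As an illustrative sketch (the file names and normalize_input are hypothetical), the statistics computed once on the training set are simply reused for every production input:

```python
import numpy as np

# Statistics computed once on the training set and saved alongside the model
# (hypothetical file names, for illustration only).
train_mean = np.load("train_mean.npy")  # e.g. per-feature or per-pixel mean
train_std = np.load("train_std.npy")    # e.g. per-feature or per-pixel std

def normalize_input(x, eps=1e-8):
    """Normalize a production input exactly as the training inputs were."""
    return (x - train_mean) / (train_std + eps)
```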
