I have seen other topics here about how batch norm works, but I do not really see the answer to my question. So here it is:
In the video Andrew says that if we look at layer 2, the hidden unit values Z[2]1, Z[2]2, ... will keep changing during the updates, but with batch norm their mean and variance remain the same, namely beta[2] and gamma[2]. I do not understand this, since gamma and beta are also updated during training, which is the whole point of making them learnable. So that mean and variance also change. Or is the idea that they still change, just that they change smoothly?
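To make sure I have the mechanics right, here is a minimal NumPy sketch of the per-batch normalization from the lecture (the function name `batch_norm_forward` and the toy shapes are my own, just for illustration). It shows that however much the raw Z[2] distribution shifts between batches, the output's mean and variance stay pinned at beta and gamma^2 as long as gamma and beta themselves are held fixed:

```python
import numpy as np

def batch_norm_forward(z, gamma, beta, eps=1e-8):
    """Batch norm over a mini-batch of pre-activations z (shape: units x batch)."""
    mu = z.mean(axis=1, keepdims=True)      # per-unit batch mean
    var = z.var(axis=1, keepdims=True)      # per-unit batch variance
    z_norm = (z - mu) / np.sqrt(var + eps)  # zero mean, unit variance
    return gamma * z_norm + beta            # mean -> beta, variance -> gamma**2

# Two mini-batches whose raw statistics differ wildly
rng = np.random.default_rng(0)
z_a = rng.normal(0.0, 1.0, size=(3, 256))
z_b = rng.normal(5.0, 9.0, size=(3, 256))   # heavily shifted and rescaled input

gamma = np.array([[1.5], [0.5], [2.0]])
beta = np.array([[0.0], [1.0], [-1.0]])

for z in (z_a, z_b):
    out = batch_norm_forward(z, gamma, beta)
    print(out.mean(axis=1), out.var(axis=1))  # ~beta and ~gamma**2 both times
```

So as far as I can tell, the distribution of the normalized Z[2] only moves when gradient descent moves beta[2] and gamma[2] themselves, not when the earlier layers' weights move, and that is why I am asking whether "remain the same" really means "change only slowly via beta and gamma".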