Batch Norm At Test Time Clarification

Hi Mentor,

Suppose that, instead of using mini-batches at training time, we trained on the entire dataset as a single batch. Would we still need an exponentially weighted average to estimate the mean and variance?

My understanding is that, because we process different mini-batches at training time, the mean and variance vary from one mini-batch to the next. So we use an exponentially weighted average to estimate the mean and variance across the mini-batches, and use those estimates at test time.
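To make this concrete, here is a minimal sketch of how I picture it, assuming NumPy. The function name `running_batchnorm_stats`, the momentum value of 0.9, and the synthetic activations are my own choices for illustration, not something from the course material.

```python
import numpy as np

def running_batchnorm_stats(z, batch_size, momentum=0.9):
    """Track exponentially weighted averages of per-mini-batch mean and variance.

    z: array of shape (num_examples, num_features), standing in for the
       pre-activation values of one layer (hypothetical data).
    """
    running_mean = np.zeros(z.shape[1])
    running_var = np.ones(z.shape[1])
    for start in range(0, z.shape[0], batch_size):
        batch = z[start:start + batch_size]
        mu = batch.mean(axis=0)    # mean of this mini-batch only
        var = batch.var(axis=0)    # variance of this mini-batch only
        # Each mini-batch gives slightly different statistics, so blend them.
        running_mean = momentum * running_mean + (1 - momentum) * mu
        running_var = momentum * running_var + (1 - momentum) * var
    return running_mean, running_var

# Synthetic activations with true mean 3 and true variance 4 (assumed values).
rng = np.random.default_rng(0)
z = rng.normal(loc=3.0, scale=2.0, size=(6400, 4))
running_mean, running_var = running_batchnorm_stats(z, batch_size=64)
```

After the 100 mini-batches, `running_mean` and `running_var` should sit close to the true mean and variance of the data, and these are the values that would be used at test time.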

If we use a single batch at training time, then there is no need for the exponentially weighted average, correct?

Yes, that sounds correct to me.
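As a rough illustration of the full-batch case, here is a small sketch assuming NumPy. The helper `batchnorm_inference`, the synthetic data, and the values of `gamma`, `beta`, and `eps` are assumptions made for the example. One caveat: in real training the activations change as the weights are updated, so the statistics to keep would be those from the final pass over the data.

```python
import numpy as np

def batchnorm_inference(z, mu, var, gamma, beta, eps=1e-8):
    """Normalize z with fixed statistics, then scale and shift."""
    return gamma * (z - mu) / np.sqrt(var + eps) + beta

# Synthetic activations for the whole training set (assumed data).
rng = np.random.default_rng(0)
z_train = rng.normal(loc=3.0, scale=2.0, size=(1000, 4))

# Full batch: a single mean and variance cover the whole training set,
# so there is nothing to average across mini-batches.
mu = z_train.mean(axis=0)
var = z_train.var(axis=0)

gamma = np.ones(4)   # learned scale, set to 1 for illustration
beta = np.zeros(4)   # learned shift, set to 0 for illustration

# At test time, reuse the same mu and var, even for a handful of examples.
z_test = rng.normal(loc=3.0, scale=2.0, size=(5, 4))
z_test_norm = batchnorm_inference(z_test, mu, var, gamma, beta)

# Sanity check on the training set itself: roughly zero mean, unit variance.
z_train_norm = batchnorm_inference(z_train, mu, var, gamma, beta)
```

Here `mu` and `var` play the role that the exponentially weighted averages play in the mini-batch setting.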