Hi everyone,
I’m studying Batch Normalization, but I’m having some difficulty understanding how to calculate bias and variance when it’s in use.
What I’ve learned so far:
- Bias ≈ (training error) − (Bayes error), where Bayes error is the irreducible error (e.g. approximated by human performance on the same task).
- Variance ≈ (validation error) − (training error).
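To double-check that I’m reading these two formulas correctly, here’s a toy numeric example (the numbers are invented purely for illustration):

```python
# Invented numbers, just to make the decomposition concrete.
bayes_error = 0.01   # proxy for irreducible error, e.g. estimated human error
train_error = 0.08   # error measured on the training set
val_error   = 0.10   # error measured on the dev/validation set

bias = train_error - bayes_error      # ≈ 0.07 -> mostly a bias (underfitting) problem
variance = val_error - train_error    # ≈ 0.02 -> small train/dev gap (low variance)
print(f"bias ≈ {bias:.2%}, variance ≈ {variance:.2%}")
```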
The confusion:
- During training, we run the model in train mode:
- BatchNorm uses each mini-batch’s mean/variance
- Dropout is active
- This gives us the train-mode error that guides weight updates.
- During evaluation (dev/test), we switch to eval mode:
- BatchNorm uses the running mean/variance
- Dropout is disabled
- This gives us the eval-mode error that reflects real-world inference.
So there are two different “training errors”:
- Train-Mode Error – computed in model.train() with batch stats & dropout
- Train-Eval Error – computed in model.eval() on the training set using running stats & no dropout
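For concreteness, this is roughly how I’m measuring the two (a PyTorch-style sketch; `model`, `train_loader`, `val_loader` and the classification setup are placeholders from my own experiments, not anything standard):

```python
import torch

@torch.no_grad()
def error_rate(model, loader, device="cpu"):
    """Fraction of misclassified examples under the model's *current* mode."""
    wrong, total = 0, 0
    for x, y in loader:
        x, y = x.to(device), y.to(device)
        preds = model(x).argmax(dim=1)  # assumes the model outputs class logits
        wrong += (preds != y).sum().item()
        total += y.numel()
    return wrong / total

# 1) Train-mode error: BatchNorm uses each mini-batch's statistics, dropout is active.
#    (Note: even under no_grad, a forward pass in train mode still updates the
#    BatchNorm running statistics as a side effect.)
model.train()
train_mode_error = error_rate(model, train_loader)

# 2) Train-eval error: same training set, but BatchNorm uses its running
#    statistics and dropout is disabled.
model.eval()
train_eval_error = error_rate(model, train_loader)

# Validation error is measured in eval mode as well.
val_error = error_rate(model, val_loader)
```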
My question is:
For the bias–variance decomposition, which of these two training errors should be used as “training error”?
- If I use the train-mode error, it includes regularization noise (dropout masks and BatchNorm’s per-batch fluctuations), so it seems to overstate the error relative to the model’s true representational capacity.
- If I use train-eval error, it mirrors the inference conditions and should lie closer to the model’s best achievable error (i.e. closer to Bayes error).
Intuitively, I think both bias and variance should be measured in eval mode, so that:
- Bias = train-eval error − Bayes error
- Variance = validation error − train-eval error
This way, I’m comparing “apples to apples” for both train and dev/test. But I’ve also seen advice that suggests using the train-mode error to gauge variance.
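In code form, my “apples to apples” proposal amounts to nothing more than this bookkeeping (reusing eval-mode errors like those in the sketch above):

```python
def bias_variance(train_eval_error: float, val_error: float, bayes_error: float):
    """Decompose errors with both train and dev/test measured in eval mode."""
    bias = train_eval_error - bayes_error
    variance = val_error - train_eval_error
    return bias, variance
```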
Can anyone explain:
- Which of these training errors is correct to use for bias and for variance, and
- Why the standard practice chooses one over the other?
Thanks in advance for clarifying this subtle but crucial point!