In DLS/C3/W1, avoidable bias refers to the difference between the training set error and the Bayes error, while the difference between the dev set error and the training set error is called variance. In previous videos, variance increased when you overfit, which Andrew describes as happening only if your training set error is smaller than the Bayes error. My doubts are:

- If the training set error and Bayes error are close, but the dev set error is far from the training set error, am I not already overfitting? The only solution I can see is to accept a bit more training set error in order to decrease the dev set error. How can the training set error stay close to the Bayes error (small avoidable bias) while only the dev set error (variance) decreases?
- I’m familiar with techniques to decrease variance, such as regularization, increasing the training set size, early stopping, changing the network topology, etc. Don’t all of these also affect (increase) avoidable bias?
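For concreteness, the decomposition behind the first question can be sketched with made-up error figures (these numbers are hypothetical, not from the course):

```python
# Hypothetical error figures illustrating the bias/variance decomposition
# from DLS/C3/W1 (all numbers invented for illustration).
bayes_error = 0.01   # assumed human-level / Bayes error
train_error = 0.015  # error on the training set
dev_error   = 0.10   # error on the dev set

avoidable_bias = train_error - bayes_error  # small: close to Bayes level
variance       = dev_error - train_error    # large: the model overfits

print(f"avoidable bias: {avoidable_bias:.3f}")
print(f"variance:       {variance:.3f}")
```

In this made-up scenario the avoidable bias is already small, so the question is exactly whether variance can be driven down without the first number growing.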

For instance, my understanding of regularization is that you force the high-order coefficients to take very small (negligible) values by making the system minimize the cost function plus the added regularization term. In this way your model fits the training data less closely (you have fewer high-order polynomial terms to fit complicated shapes) and you avoid overfitting, i.e. you generalize better on unseen data. But by doing this, aren’t you moving further away from the Bayes error level? In the end, if we assume humans are at the Bayes level, humans can identify very well how to fit a line that separates wanted from unwanted data, no matter how “high-order” the polynomial terms our brains need are.
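That intuition about shrinking coefficients can be sketched with a ridge (L2) fit on toy data — the dataset, polynomial degree, and λ values below are all invented for illustration, not anything from the course:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 1-D dataset (made up): a smooth target function plus noise.
x = rng.uniform(-1, 1, 30)
y = np.sin(2 * x) + 0.1 * rng.standard_normal(30)

# A high-order polynomial feature matrix invites overfitting.
degree = 12
X = np.vander(x, degree + 1, increasing=True)

results = {}
for lam in [0.0, 1e-3, 1.0]:
    # Ridge via augmented least squares:
    # minimize ||X w - y||^2 + lam * ||w||^2.
    A = np.vstack([X, np.sqrt(lam) * np.eye(degree + 1)])
    b = np.concatenate([y, np.zeros(degree + 1)])
    w, *_ = np.linalg.lstsq(A, b, rcond=None)
    train_mse = np.mean((X @ w - y) ** 2)
    results[lam] = (np.max(np.abs(w)), train_mse)
    print(f"lambda={lam:g}: largest |coef|={results[lam][0]:.3f}, "
          f"train MSE={train_mse:.5f}")
```

As λ grows, the largest coefficient shrinks and the training MSE creeps up, which is exactly the trade-off the question is about: buying less variance at the price of a slightly worse fit to the training set.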