- In the Basic Recipe for ML video, we are told to first check for high bias and then for high variance. If it turns out that our model initially has high bias on the training set, we change the model as per the recommendations, but then when we check the development set error it shows high variance, so again we need to change the model as per the recommendations.
Is it feasible in such situations to follow this process sequentially? Or should we look at the training set error and dev set error at once and then apply the required changes?
- While creating the training, dev and test sets, is it necessary to draw the dev and test sets from the same distribution as the training set? Or should the dev and test sets come from the same distribution as each other, but not necessarily from the training set distribution? Also, how will it affect the model if the training and dev sets are from the same distribution, but the test set is from a different one?
Interesting discussion! Yes, for question 1) it is an iterative process involving both the training data and the dev data. At each step you have to do the analysis to figure out which problem you are trying to solve. If the dev error is much higher than the training error, that still doesn’t completely determine what to do next. E.g. if you have these accuracy numbers:
Training accuracy = 85%
Dev accuracy = 75%
Then it’s not really an “overfitting” problem yet: you still don’t have very high accuracy on the training set. So the next step here is to address what we hope is “avoidable bias” on the training set. E.g. by increasing the complexity of the model.
But if you have this accuracy:
Training accuracy = 99%
Dev accuracy = 85%
Then that is an overfitting problem and Prof Ng gave us several things to try in that case including getting more training data and/or adding regularization.
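For concreteness, here is a minimal Python sketch (my own, not from the course) of the decision logic described above. The `human_acc` value is an assumed proxy for human-level (roughly Bayes) performance, and the suggested remedies just summarize the lecture's recommendations:

```python
def diagnose(train_acc, dev_acc, human_acc=0.99):
    """Suggest the next step based on training/dev accuracy (illustrative only)."""
    avoidable_bias = human_acc - train_acc  # gap to assumed human-level performance
    variance = train_acc - dev_acc          # gap between training and dev performance

    if avoidable_bias > variance:
        # High bias: the model isn't fitting the training set well enough yet.
        return "Address bias first: bigger network, train longer, different architecture."
    else:
        # High variance: the model fits the training set but doesn't generalize.
        return "Address variance: more training data, regularization (L2/dropout), augmentation."

# The two cases from the discussion above:
print(diagnose(train_acc=0.85, dev_acc=0.75))  # bias gap (0.14) > variance gap (0.10) -> fix bias first
print(diagnose(train_acc=0.99, dev_acc=0.85))  # variance gap dominates -> overfitting, fix variance
```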
For question 2), it is important that the dev and test data be from the same distribution, but it is not technically required that the training data be from the same distribution as the dev/test data. Prof Ng will spend much more time and really “go deep” on this type of issue in Course 3. I’ve watched that more recently than I watched this section of Course 2, so I forget whether he says much about that here. Maybe the best idea is to “hold that thought” until you get to Course 3.
Thank you, Paul.
You mean to say that we should first get both the training and dev errors, and then decide what actions to take based on the situation (overfitting/underfitting). This seems like a reasonable approach, because changing the model right after getting the training error (even if it does not deviate much from human-level performance) doesn't sound logical.