“Choose a dev and test set to reflect data you expect to get in the future and consider important to do well on.”
Could anyone please explain the line above, quoted from Prof. Andrew?
Hi @Rhythm_Dutta,
Which video does this sentence come from? It is hard to discuss someone’s words without knowing the context.
Thank you.
Raymond
This line was discussed in the video “Train/Dev/Test Distributions” from Week 1 of Course 3.
Thanks, @Rhythm_Dutta. The idea of that part of the lecture is this: if our dev/test sets reflected only medium income zip codes, then, because we use the dev set to tune the model, the tuned model would likely work well only for medium income zip codes. The dev set plays a crucial role here.
It would therefore not be expected to do well on low income zip codes, because we assume their data behaves quite differently from that of medium income zip codes.
In contrast, if we want the model to work well for low income zip codes, we should include that kind of sample in our dev/test sets. Then, when we tune the model with the dev set, low income zip codes are taken into account. In other words, the tuned model has to work well on low income zip codes as well!
This is why the lecture reminds us to make sure the dev/test sets reflect whatever we want our model to do well on.
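To make this concrete, here is a minimal toy sketch in Python. Everything in it is made up for illustration and is not from the lecture: a single feature `x`, an assumed "true" cutoff of 5.0 for medium income zip codes and 3.0 for low income ones, and a "model" that is just a threshold chosen on the dev set.

```python
# Toy illustration only: the data, cutoffs, and candidate thresholds are
# hypothetical and chosen to make the effect easy to see.

def make_group(true_cutoff, xs):
    """Label each x as 1 if it exceeds the group's (assumed) true cutoff."""
    return [(x, int(x > true_cutoff)) for x in xs]

def accuracy(threshold, data):
    """Fraction of examples where the rule `x > threshold` matches the label."""
    return sum(int(x > threshold) == y for x, y in data) / len(data)

def tune(dev_set, candidates):
    """'Tuning' here means picking the threshold with the best dev accuracy."""
    return max(candidates, key=lambda t: accuracy(t, dev_set))

xs = [i * 0.5 for i in range(21)]   # 0.0, 0.5, ..., 10.0
medium = make_group(5.0, xs)        # medium income zip codes (assumed cutoff)
low = make_group(3.0, xs)           # low income zip codes (assumed cutoff)

candidates = [2.0, 3.0, 4.0, 5.0, 6.0]

# Dev set contains only medium income data.
t_medium_only = tune(medium, candidates)
# Dev set reflects the data we actually care about (low income).
t_low = tune(low, candidates)

print("tuned on medium:", t_medium_only, "-> accuracy on low:", accuracy(t_medium_only, low))
print("tuned on low:   ", t_low, "-> accuracy on low:", accuracy(t_low, low))
```

In this toy setup, tuning on the medium income dev set picks 5.0, which gets only 17 of the 21 low income examples right (about 0.81), while tuning on a low income dev set picks 3.0 and gets all of them right. Real data is much messier, but the mechanism is the same: the dev set decides what "best" means.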
Cheers,
Raymond