Training-Dev Sets

Hey guys, while watching the video in C3 - W2, I saw that Prof. Ng says you shuffle the training set and carve out a piece of it as the training-dev set. He also says that the NN trains on the training set but not on the training-dev set. So what does the training-dev set do? Does it do the same job as the dev set?

He explains all of that in the same lecture where he introduces the concept. If you missed the full explanation, you should watch the lecture again. The point is that the training-dev set is used as a stand-in for the “dev set” in the case where the dev and test sets are from a different distribution than the training set. Note that he makes a big point of the fact that the dev and test sets need to be from the same distribution, but the training set does not. The training-dev set is carved out of the shuffled training data, so it has the same distribution as the training set, but the network never trains on it. In the case where the training set and the dev/test sets are from different distributions, you train on the training set and then use the training-dev set to assess how well the training worked: whether you are overfitting or underfitting, and how your hyperparameter tuning is going. Once you’ve done that, you compare against the performance on the dev set; a gap between training-dev and dev performance then points to the difference in distributions (data mismatch) rather than to overfitting. Seriously, he explained it all in the lectures. With my “mini-explanation” above in mind, please watch the relevant lectures again and I bet it will make sense this time.
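In case it helps to see the idea in code, here is a minimal sketch of the split and of how the three error numbers can be read. It is not taken from the course materials: the function names `carve_training_dev` and `diagnose`, the 10% hold-out fraction, and the error values in the usage lines are all assumptions made up for illustration.

```python
import random


def carve_training_dev(train_examples, train_dev_fraction=0.1, seed=0):
    """Shuffle the training data and hold out a slice as the training-dev set.

    Hypothetical helper: the held-out slice has the same distribution as the
    training set, but the model is never trained on it.
    """
    shuffled = list(train_examples)
    random.Random(seed).shuffle(shuffled)
    n_train_dev = int(len(shuffled) * train_dev_fraction)
    train_dev = shuffled[:n_train_dev]
    train = shuffled[n_train_dev:]
    return train, train_dev


def diagnose(train_error, train_dev_error, dev_error):
    """Split the train-to-dev error gap into two parts (hypothetical helper).

    variance_gap:      training-dev error minus training error
                       (same distribution, unseen data, so this suggests overfitting)
    data_mismatch_gap: dev error minus training-dev error
                       (both unseen, different distribution)
    """
    return {
        "variance_gap": train_dev_error - train_error,
        "data_mismatch_gap": dev_error - train_dev_error,
    }


# Usage with made-up numbers: 1000 training examples, 10% held out.
train, train_dev = carve_training_dev(range(1000), train_dev_fraction=0.1)
gaps = diagnose(train_error=0.01, train_dev_error=0.09, dev_error=0.10)
print(len(train), len(train_dev))
print(gaps)
```

With these illustrative errors, most of the gap sits between the training and training-dev sets, which would suggest a variance problem. If the training-dev error were instead close to the training error and the jump came at the dev set, that would suggest data mismatch.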

Oh yes, thanks! Sorry, I asked this question while I was still watching the lecture and hadn’t finished it yet. After I finished watching it, I started to understand, and with your answer it does make sense. Thanks for your time. Sorry…