Question in the introduction video

jiahengchen · July 25, 2021, 6:17am

In the introduction video, Andrew Ng said that if the model does well on the test set but performs poorly in the real app, then we need to change the cost function or change the dev set distribution.
I am confused about why we need to change the dev set and not the test set. I think the model is overfitting the test set and not generalizing to real-world data.

manifest · July 30, 2021, 7:08pm

Hey @jiahengchen,

I guess that is based on the assumption that the test set is correct and close to the true data distribution. We don’t strictly require that the dev set also come from the same distribution as the test set, but to achieve better generalization these distributions need to be close. I believe this is the idea Andrew was talking about in the video.

Dave_Espinosa · October 27, 2021, 5:43pm

Hello Andrei,

In the case where we have found out it is necessary to “change the dev set”, then we would be dealing with 3 different data distributions / sources (test, dev, train), right?

Thanks a lot!

manifest · October 28, 2021, 5:21pm

On the contrary, we want both our dev and test sets to be close to the real-data distribution, so we’re correcting the dev set to match it rather than introducing a third distribution.
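
To make that concrete, here is a minimal sketch of one way to keep the dev and test sets on the same distribution: pool the examples that resemble real-app data, shuffle them, and split that single pool in two. The names `split_dev_test`, `web_images`, and `app_images` are hypothetical and used only for illustration; they are not from the course materials.

```python
import random


def split_dev_test(target_examples, dev_fraction=0.5, seed=0):
    """Shuffle examples from the target (real-app) distribution and split
    them into dev and test sets, so both come from the same pool."""
    rng = random.Random(seed)
    shuffled = list(target_examples)  # copy, so the caller's list is untouched
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * dev_fraction)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical data: many easy-to-collect examples plus fewer real-app examples.
web_images = [("web", i) for i in range(1000)]
app_images = [("app", i) for i in range(200)]

# Dev and test are both drawn only from the real-app pool.
dev_set, test_set = split_dev_test(app_images)

# The training set may still come from a different source.
train_set = web_images
```

In this sketch only the training data differs in origin, so there are at most two distributions in play (train versus dev/test), not three.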
