When do we say we are overfitting the dev set?

Why does Prof. Andrew say here that we are overfitting the dev set?
The dev and test sets have equal error; isn't that good?

Still, the error is quite large compared to the training set. If the model performs well on the train set but much worse on the dev or test set, that means the model is overfitting the training data.

I think I didn't phrase the question clearly.
I'm asking about overfitting the dev set, not about high variance relative to the training data.
That's what Dr. Andrew says on this slide: "you're overfitting the dev set".

Well, I think what he meant there is that the gap between the "dev error" and the "test error" would indicate overfitting of "dev" relative to "test", if there were any such gap. But in this case there isn't.

But with the particular performance example he gives here, you’ve got bigger problems to solve first before you get around to any putative dev overfitting. I’d say that fixing the avoidable bias is the first order of business here.
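Just to make those gaps concrete, here's a rough sketch of the arithmetic involved (the error numbers below are made-up placeholders, not the numbers from the slide):

```python
# Rough sketch of the gaps discussed in this part of the course.
# These error values are made-up placeholders, not taken from the slide.
human_error = 0.01   # proxy for Bayes error
train_error = 0.07
dev_error   = 0.10
test_error  = 0.10

avoidable_bias  = train_error - human_error  # 0.06: fix this first (bigger network, train longer, ...)
variance        = dev_error - train_error    # 0.03: more data, regularization, ...
dev_overfitting = test_error - dev_error     # 0.00: no overfitting of the dev set in this example

print(f"avoidable bias:  {avoidable_bias:.2f}")
print(f"variance:        {variance:.2f}")
print(f"dev overfitting: {dev_overfitting:.2f}")
```

In this example the largest gap by far is the avoidable bias, which is why I'd work on that before worrying about any dev overfitting.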


@paulinpaloalto that's helpful, thanks.
You mean that if the test error equals the dev error, then there is no problem with the dev and test sets?

Yes, notice that he labels that arrow "degree of overfitting to dev set". But in this case the difference is 0. If it were non-zero, then it would indicate overfitting and you'd need to do something about it. If you keep going in that lecture, he will discuss possible ways to address that when it occurs. That is the whole point of this section of the course: showing all the different types of issues that can arise with training and giving guidance about how to deal with them.

Note that it's been more than a year since I took Course 3, so my memory of what he says on this subject is not complete. But I do remember that he makes a big deal of the fact that your dev and test data need to be from the same distribution, although the training data can be from a different distribution than "dev/test". If the dev/test data is from the same distribution, you'd think that this particular form of overfitting would be a fairly rare case. If you see a difference there, it may be telling you that your dev and test data are actually not from the same distribution, or that the split you did between dev and test was unbalanced or biased.
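For what it's worth, here's a minimal sketch of the kind of split that avoids that kind of bias (the array and sizes are just made-up placeholders): shuffle the combined dev/test pool before splitting, so both sets end up drawn from the same distribution.

```python
# Minimal sketch (placeholder data): shuffle the combined dev/test pool
# before splitting, so dev and test come from the same distribution.
import numpy as np

rng = np.random.default_rng(0)
pool = np.arange(10_000)      # stand-in for your combined dev/test examples
rng.shuffle(pool)             # shuffle first so the split isn't ordered or biased

dev, test = pool[:5_000], pool[5_000:]
```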


Yes, thanks for your informative responses, @paulinpaloalto.