Cross-validation Error vs Generalization Error

buronbrarda · August 17, 2022, 3:08pm

Hi!

I was watching the video “Model selection and training/cross-validation/test sets” when this doubt arose.

In this video, Andrew says that in order to evaluate a model we should use an extra cross-validation set (or dev-set), and then choose the model which minimizes the error for this set. However, after that, he says that we should measure the generalization error with the test-set.

My question is the following:

What happens if we find that the model that minimizes the error on the dev-set is different from the model that minimizes the error on the test-set? Which model should we choose in that case?

I understand that we have to use the dev-set to choose the model in order to avoid a bias (because one model might simply happen to predict the test-set values better than another), but in my opinion the same bias appears here, since one model could also happen to predict the dev-set values better than another. To me, the best solution to this problem is to use K-fold cross-validation and evaluate the models using the average of the errors obtained across the folds.
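To make the K-fold idea above concrete, here is a minimal sketch in plain Python. Everything in it is an assumption made for illustration and is not from the course: the synthetic data, the two candidate models (`fit_mean` and `fit_line`), and the helper `kfold_error`. Each model is trained K times, each time validated on a different held-out fold, and the fold errors are averaged.

```python
import random

def fit_mean(xs, ys):
    """Candidate model 1 (illustrative): always predict the mean of the training targets."""
    m = sum(ys) / len(ys)
    return lambda x: m

def fit_line(xs, ys):
    """Candidate model 2 (illustrative): least-squares straight line, closed form."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    return lambda x: slope * x + intercept

def mse(model, xs, ys):
    """Mean squared error of a fitted model on (xs, ys)."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def kfold_error(fit, xs, ys, k=5):
    """Average validation MSE over k folds."""
    n = len(xs)
    fold_errors = []
    for fold in range(k):
        # Every k-th sample (offset by `fold`) is held out for validation.
        val_idx = set(range(fold, n, k))
        train_x = [xs[i] for i in range(n) if i not in val_idx]
        train_y = [ys[i] for i in range(n) if i not in val_idx]
        val_x = [xs[i] for i in sorted(val_idx)]
        val_y = [ys[i] for i in sorted(val_idx)]
        model = fit(train_x, train_y)
        fold_errors.append(mse(model, val_x, val_y))
    return sum(fold_errors) / k

random.seed(0)  # arbitrary seed, only for reproducibility
# Synthetic data (assumption): a noisy straight line.
xs = [random.uniform(0, 10) for _ in range(50)]
ys = [2 * x + 1 + random.gauss(0, 1) for x in xs]

cv_mean = kfold_error(fit_mean, xs, ys)
cv_line = kfold_error(fit_line, xs, ys)
print("K-fold MSE, mean predictor:", cv_mean)
print("K-fold MSE, straight line: ", cv_line)
```

Note that each candidate is trained K times here, so the cost grows with K and with the size of the dataset.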

What do you think? Can anybody help me?

Thanks a lot!

gent.spah · August 17, 2022, 3:15pm

We should point out that all these sets come from the same distribution, so if the model performs well on one part of the dataset but poorly on other parts, it's still not a good model after all.

K-fold cross-validation is definitely a good technique, but it is mostly applicable to small datasets. On large datasets it is computationally very inefficient to repeat the training process many times, so he suggests holding out a dev-set that is representative of the overall distribution and using it to evaluate the model.

buronbrarda · August 17, 2022, 3:30pm

Thanks!

Ok. I understand that K-fold cross-validation may not be feasible.

However, as you pointed out, if the model performs badly on part of the dataset, it doesn’t seem to be a good model. So what I’m asking myself now is: why should we separate a dev-set from the test-set? Having a larger test-set to evaluate and choose the models should be better, shouldn’t it?

gent.spah · August 17, 2022, 3:44pm

The dev-set is the set on which you tune the hyperparameters of your model to make it fit both the training set and the dev-set better. Consequently, you can say that the model also learns from the dev-set in some way. The test set is completely unseen by the model, so at this point we see how it performs “outside laboratory conditions”.
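A small simulation can illustrate why the error of the selected model on the dev-set tends to look better than its error on unseen data. This is only a sketch under made-up assumptions: 50 hypothetical models that all share the same true error rate of 0.3, with a dev-set and a test-set of 200 examples each. None of these numbers come from the course.

```python
import random

random.seed(0)  # arbitrary seed, only for reproducibility
N_MODELS, N_DEV, N_TEST, N_TRIALS = 50, 200, 200, 100
TRUE_ERROR = 0.3  # assumption: every hypothetical model is equally good

def observed_error(n):
    """Error rate measured on n fresh examples for a model with TRUE_ERROR."""
    return sum(random.random() < TRUE_ERROR for _ in range(n)) / n

dev_of_winner, test_of_winner = [], []
for _ in range(N_TRIALS):
    # Score every candidate on the dev-set and keep the one with the lowest error.
    dev_errors = [observed_error(N_DEV) for _ in range(N_MODELS)]
    best = min(range(N_MODELS), key=lambda i: dev_errors[i])
    dev_of_winner.append(dev_errors[best])
    # Score the winner once on test data that played no part in the selection.
    test_of_winner.append(observed_error(N_TEST))

avg_dev = sum(dev_of_winner) / N_TRIALS
avg_test = sum(test_of_winner) / N_TRIALS
print("average dev error of the selected model: ", avg_dev)
print("average test error of the selected model:", avg_test)
```

Because the winner is picked for having the lowest dev error, part of that low error is luck, so the dev figure is optimistic. The test figure stays close to the true error rate, since the test-set was not used for the choice. Choosing the model on the test-set instead would simply move the same optimism onto the test error.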

Arkadi_Popov · August 30, 2022, 12:23am

By this logic, why not extend the model to use more than one dev set? Say two or three or ten. Was it proven that one dev set is optimal? Curious.

rmwkwok · August 30, 2022, 3:10am

Hi @Arkadi_Popov,

In my opinion, first, data is a scarce resource, and we are only willing to give away a certain portion to build a cv set.

Second, further dividing the cv set into 10 sub-cv-sets won’t give us a different outcome if we take the weighted average (weighted by subset size) of the 10 subsets’ metric results, as long as the metric is evaluated by adding up (or taking the mean of) the losses of individual samples.

If you have time, you can define a metric (e.g. mean squared error) and make 20 pseudo cv-set samples, giving each of them a true value and a predicted value. Calculate the metric value A for the 20 samples as a whole. Then divide the samples into 4 equal-sized subsets, calculate a metric value for each subset, and take the average of the 4 values to get measurement B. Then you may compare A with B.
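The experiment described above could be sketched as follows. The 20 (true, predicted) pairs are random made-up values, and the final part with unequal subset sizes is an added illustration of the weighted average mentioned earlier.

```python
import random

def mse(pairs):
    """Mean squared error over a list of (true, predicted) pairs."""
    return sum((t - p) ** 2 for t, p in pairs) / len(pairs)

random.seed(0)  # arbitrary seed, only for reproducibility
# 20 pseudo cv-set samples; the values are purely illustrative.
samples = [(random.uniform(0, 10), random.uniform(0, 10)) for _ in range(20)]

# A: the metric on the 20 samples as a whole.
A = mse(samples)

# B: plain average of the metric over 4 equal-sized subsets of 5 samples each.
subsets = [samples[i:i + 5] for i in range(0, 20, 5)]
B = sum(mse(s) for s in subsets) / len(subsets)

# With unequal subset sizes, weight each subset's metric by its share of the samples.
uneven = [samples[:2], samples[2:9], samples[9:12], samples[12:]]
B_weighted = sum(len(s) / len(samples) * mse(s) for s in uneven)

print("A          =", A)
print("B          =", B)
print("B_weighted =", B_weighted)
```

A, B and B_weighted should agree up to floating-point rounding.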

Raymond

Arkadi_Popov · August 31, 2022, 1:21am

Thanks Raymond. I guess you are referring to the Central Limit Theorem in your suggestion?

rmwkwok · August 31, 2022, 6:50am

Hey Arkadi,

We don’t need the CLT here, because algebra is enough to derive that A and B will be the same, not to mention that we are discussing only 4 subsets here.
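One way to write that algebra out, with notation chosen here for illustration: let the cv set contain $N$ samples with per-sample losses $\ell_i$, split into $K$ disjoint subsets $S_1, \dots, S_K$ of sizes $n_1, \dots, n_K$ with $\sum_k n_k = N$.

$$
A = \frac{1}{N}\sum_{i=1}^{N} \ell_i,
\qquad
M_k = \frac{1}{n_k}\sum_{i \in S_k} \ell_i,
\qquad
B = \sum_{k=1}^{K} \frac{n_k}{N}\, M_k
  = \frac{1}{N}\sum_{k=1}^{K}\sum_{i \in S_k} \ell_i
  = A .
$$

When all subsets have the same size, $n_k = N/K$, each weight $n_k/N$ equals $1/K$, so the plain average of the $M_k$ is also equal to $A$.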

Raymond
