Why does the train-dev set have the same distribution as the train set and not the dev set?

quangdaist123 · August 12, 2021, 2:13am

If I combine a relatively small subset of the train set with the dev set, is it always true that the combined set has the same distribution as the train set?
Any example would be much appreciated. Thank you.

saiman · August 12, 2021, 5:55am

No.

But train-dev is not a combination of the training and dev sets. It is a subset of the training data that is held out and not used for training. Because it comes from the same distribution as the training set, comparing train-dev accuracy with training accuracy is free of data-mismatch risk, so the gap between the two is a good measure of how much variance there is.
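Here is a rough sketch of the idea in Python, in case an example helps. Everything in it is made up for illustration: the data are synthetic one-dimensional numbers (training data centred at 0, dev data centred at 3), and `carve_train_dev` and `error_gaps` are hypothetical helper names, not anything from the course materials. It shows that a train-dev set carved out of the training pool follows the training distribution, while a mix of a training subset and the dev set follows neither.

```python
import random


def carve_train_dev(train_pool, holdout_fraction=0.1, seed=0):
    """Shuffle the training pool and hold out a fraction as the train-dev set."""
    rng = random.Random(seed)
    shuffled = train_pool[:]
    rng.shuffle(shuffled)
    n_holdout = int(len(shuffled) * holdout_fraction)
    # First slice is what the model trains on, second slice is train-dev.
    return shuffled[n_holdout:], shuffled[:n_holdout]


def mean(xs):
    return sum(xs) / len(xs)


def error_gaps(train_err, train_dev_err, dev_err):
    """Split the train-to-dev error gap into a variance part and a mismatch part."""
    return {
        "variance": train_dev_err - train_err,    # same distribution, unseen data
        "data_mismatch": dev_err - train_dev_err  # different distribution
    }


rng = random.Random(42)
# Assumption: synthetic stand-ins for two different data sources.
train_pool = [rng.gauss(0.0, 1.0) for _ in range(10_000)]
dev = [rng.gauss(3.0, 1.0) for _ in range(1_000)]

train, train_dev = carve_train_dev(train_pool)

# A small training subset mixed with dev is a blend of both distributions.
mixed = train_pool[:1_000] + dev

print("train mean:    ", round(mean(train), 2))
print("train-dev mean:", round(mean(train_dev), 2))
print("dev mean:      ", round(mean(dev), 2))
print("mixed mean:    ", round(mean(mixed), 2))

# Made-up error rates: 1% train, 9% train-dev, 10% dev.
gaps = error_gaps(0.01, 0.09, 0.10)
print(gaps)
```

With the made-up error rates at the end, most of the gap sits between training and train-dev error, which points to variance rather than data mismatch. If train-dev error were close to training error and dev error were much higher, the same arithmetic would point to data mismatch instead.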
