I have always used cross-validation for training and testing, which seems much more efficient than using a true dev set. Do you think cross-validation is just as good, or do you recommend against it? If you recommend against it, why?
Hi @michael.brent , this is a very interesting question. In the end it depends on what your objective is. With cross-validation you train several times, each time leaving out a different part of the dataset (the held-out fold). At the end you typically compute the mean and standard deviation of the metrics across folds to judge how well the model generalizes.
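For concreteness, a minimal sketch with scikit-learn (the dataset and model here are just placeholders, not anything from your setup):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Toy dataset and model, purely for illustration
X, y = load_breast_cancer(return_X_y=True)
model = LogisticRegression(max_iter=5000)

# 5-fold CV: five fits, each holding out a different 20% of the data
scores = cross_val_score(model, X, y, cv=5, scoring="accuracy")
print(f"accuracy: {scores.mean():.3f} +/- {scores.std():.3f}")
```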
Using a single train-dev-test split is faster to train (you fit only once!), but you run the risk of being misled by an unlucky choice of dev and test sets.
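And the single-split alternative for comparison, reusing X, y, and model from the snippet above (the 60/20/20 ratios are just an assumption):

```python
from sklearn.model_selection import train_test_split

# One 60/20/20 train/dev/test split instead of k folds
X_train, X_tmp, y_train, y_tmp = train_test_split(X, y, test_size=0.4, random_state=42)
X_dev, X_test, y_dev, y_test = train_test_split(X_tmp, y_tmp, test_size=0.5, random_state=42)

model.fit(X_train, y_train)                        # trained only once
print("dev accuracy:", model.score(X_dev, y_dev))  # iterate against this
# keep X_test, y_test untouched until the final evaluation
```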
Here is a nice picture showing cross-validation with all of its iterations:
Hi @michael.brent , did my answer help in any way or did I misunderstand you?
You provided some information on cross-validation but I already knew that. I use cross-validation all the time. What I was able to take away from your response is that there is no advantage to having a single dev set except for speed (fewer training runs). Is that correct?
The thrust of my question had to do with overfitting. If you iterate on an algorithm for months, even if the algorithm doesn’t train on the dev set (or if you use cross-validation), you are manually making choices that cause the algorithm to work well on the dev set, and hence you have the potential to overfit to it (through human decision-making or parameter tuning). My question was whether a single dev set has any advantages over CV in this regard. In particular, if I am doing CV, do I still have to have a separate dev set that is not part of the CV, to avoid human choices overfitting to the hold-out folds in the CV? I understood you to be saying “no”: that CV fulfills the role of the dev set, and the test set will reveal whether overfitting to the dev set by human decisions occurred.
Hi @michael.brent , cross-validation is a more robust form of validation than a single dev-test split; that is its advantage. But since you mention iterating on an algorithm for months, you might want to consider not using cross-validation during the exploratory phase to speed up your training (roughly 4-5x faster or more, depending on your k in k-fold cross-validation).
For example, if your training currently takes 1 month with 5-fold CV, you could iterate 4-5x faster with a single dev-test split and try other algorithms with other hyperparameters in the same amount of time. When you find the right algorithm + hyperparameters, it may be time to run cross-validation to re-confirm the metrics you achieved. But this might not be needed at all, depending on your use case. Something like the sketch below:
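This is a rough sketch of that two-phase workflow, assuming scikit-learn; the dataset, model, and hyperparameter grid are all made up for illustration:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score, train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_dev, y_train, y_dev = train_test_split(X, y, test_size=0.2, random_state=0)

# Exploratory phase: one fit per candidate, scored on a fixed dev set (fast)
candidates = [{"n_estimators": n, "max_depth": d}
              for n in (50, 200) for d in (3, None)]
best_params, best_score = None, -1.0
for params in candidates:
    model = RandomForestClassifier(random_state=0, **params)
    model.fit(X_train, y_train)
    score = model.score(X_dev, y_dev)
    if score > best_score:
        best_params, best_score = params, score

# Confirmation phase: 5-fold CV on the winning configuration only
final = RandomForestClassifier(random_state=0, **best_params)
cv_scores = cross_val_score(final, X, y, cv=5)
print(best_params, f"CV accuracy: {cv_scores.mean():.3f} +/- {cv_scores.std():.3f}")
```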
Hello @michael.brent , I hope we reached a common understanding, or did you perhaps not see my previous message?
Thanks a lot and enjoy the rest of Sunday!