Hi everyone, I just watched the video about how to use the cross-validation set to choose the best model. As I understand it, we use the training set and cross-validation set to choose the model, and then use the test set to evaluate that model's predictions. So what is the point of the test data? If the model is chosen using the training set and dev set, and it performs badly on the test set, do we stick with this model or choose another one? In other words, is the purpose of the test set here only to measure how well the model performs, without influencing the decision of which model to use?
Maybe this article will clear up your doubts.
The test set provides a final check on the performance of your completed system.
If the test set results are not good enough, then you go back to the beginning and improve the model.
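To make that division of roles concrete, here is a minimal sketch of the workflow in scikit-learn. The dataset, the two candidate models, and the 60/20/20 split sizes are illustrative assumptions, not something prescribed by the course:

```python
# Sketch: pick a model with train + dev, then use the test set once at the end.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=2000, random_state=0)

# 60% train, 20% dev (cross-validation set), 20% test.
X_train, X_tmp, y_train, y_tmp = train_test_split(X, y, test_size=0.4, random_state=0)
X_dev, X_test, y_dev, y_test = train_test_split(X_tmp, y_tmp, test_size=0.5, random_state=0)

# Model selection happens on the dev set only.
candidates = {
    "logreg": LogisticRegression(max_iter=1000),
    "knn": KNeighborsClassifier(n_neighbors=5),
}
dev_scores = {}
for name, model in candidates.items():
    model.fit(X_train, y_train)
    dev_scores[name] = model.score(X_dev, y_dev)
best_name = max(dev_scores, key=dev_scores.get)

# The test set is touched only once, for the final, unbiased estimate.
final_score = candidates[best_name].score(X_test, y_test)
print(f"chose {best_name}: dev={dev_scores[best_name]:.3f}, test={final_score:.3f}")
```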
Thanks! That makes it much clearer for me.
Got it! Thanks a lot for the answer!
Hi @Junxi_Li ,
It is weird that you get a good result on the validation set and a terrible result on the test set. What this tells me is that maybe the test dataset has a completely different distribution from the training and validation datasets.
If I were faced with this situation, I would immediately doubt my split and I would start again. I would:
- Do some EDA on my data to make sure the dataset is healthy.
- Reload and shuffle my data.
- Create the split. When creating the split I would consider stratification to make sure that each split has a balanced representation of all classes (particularly for classification models), and also, depending on the case, if applicable I would use grouping for cross-validation, again making sure the groups are within the same distribution (see the sketch below this list).
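If it helps, here is a rough sketch of the stratified / grouped re-split I am describing. The arrays `X`, `y`, and especially `groups` are placeholders for your reloaded data; the group IDs only apply if your samples are naturally grouped (e.g. by patient or user):

```python
# Sketch: shuffle + stratify the train/test split, and keep groups intact in CV.
import numpy as np
from sklearn.model_selection import GroupKFold, train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 10))          # stand-in for the reloaded features
y = rng.integers(0, 3, size=1000)        # stand-in for class labels
groups = rng.integers(0, 50, size=1000)  # e.g. patient / user IDs, if applicable

# Stratified split: every class keeps the same proportion in each part.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, shuffle=True, stratify=y, random_state=0
)

# Group-aware cross-validation: no group is split across train and validation.
gkf = GroupKFold(n_splits=5)
for fold, (tr_idx, va_idx) in enumerate(gkf.split(X, y, groups)):
    assert set(groups[tr_idx]).isdisjoint(groups[va_idx])
    print(f"fold {fold}: {len(tr_idx)} train / {len(va_idx)} validation samples")
```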