What is the use of the test set if the dev and test set come from the same source?
Hey @Kukuquack,
I believe you can see a detailed discussion on the same topic here.
Since the dev set is used to choose your model’s hyper-parameters, those hyper-parameters can fit (or overfit) to the dev set. The test set provides new, unseen examples on which you can evaluate your model to get a more realistic estimate of its real-world performance.
Here, you might wonder: the model doesn’t train on the dev set, so how can it overfit to the dev set? Let’s say you keep changing a particular hyper-parameter until your model performs well on the dev set. In this case, even without training on the dev set, you have found the value of the hyper-parameter that your dev set likes. That achieves the objective of choosing hyper-parameters that suit the real-world data the model will face after being deployed, but at the cost of exposing the dev set to the model. So we need a new set of unseen examples to find the true performance of our model, and that is the test set.
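To make this concrete, here is a minimal sketch in plain Python. It is only an illustration under loud assumptions: the labels are coin flips, and each "hyper-parameter setting" stands in for a model with no real skill, so its predictions are coin flips too. The names `simulate`, `n_configs`, `n_dev` and `n_test` are made up for this example. Nothing is ever trained on the dev set, yet picking the setting with the best dev accuracy still produces a dev score that looks well above chance, while the untouched test set shows the honest figure of roughly 50%.

```python
import random


def accuracy(preds, labels):
    """Fraction of predictions that match the labels."""
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def simulate(n_configs=200, n_dev=100, n_test=2000, seed=0):
    """Pick the best of many skill-free 'models' using the dev set only.

    Returns (dev accuracy of the selected setting, its test accuracy).
    """
    rng = random.Random(seed)

    # Assumption: labels are pure noise, so the true accuracy of any model is 0.5.
    dev_labels = [rng.randint(0, 1) for _ in range(n_dev)]
    test_labels = [rng.randint(0, 1) for _ in range(n_test)]

    best_dev_acc = -1.0
    best_test_preds = None
    for _ in range(n_configs):
        # Each hypothetical hyper-parameter setting yields coin-flip predictions.
        dev_preds = [rng.randint(0, 1) for _ in range(n_dev)]
        test_preds = [rng.randint(0, 1) for _ in range(n_test)]

        # Selection looks at the dev set only.
        dev_acc = accuracy(dev_preds, dev_labels)
        if dev_acc > best_dev_acc:
            best_dev_acc = dev_acc
            best_test_preds = test_preds

    # The test set is consulted once, for the selected setting only.
    return best_dev_acc, accuracy(best_test_preds, test_labels)


if __name__ == "__main__":
    dev_acc, test_acc = simulate()
    print(f"dev accuracy of the selected setting:  {dev_acc:.3f}")
    print(f"test accuracy of the selected setting: {test_acc:.3f}")
```

The gap between the two printed numbers is the "overfitting to the dev set" described above: the dev score was used to make the choice, so it is no longer an unbiased estimate.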
I can keep on going with many other examples, but I guess you get the gist of this. I hope this helps.
Regards,
Elemento
The model is tuned on the dev set, so you can say that it learns from it; it’s seen data. The test data is completely unseen.
You are amazing! I never understood this concept in any of the ML courses I took, but now I do. Thank you very much!
I am glad I could help. Do check out the mentioned thread as well. It may help you to understand the same concept from a different perspective.
Regards,
Elemento
I have a doubt here… if this is the reason why we use a test set, then isn’t it possible to overfit to the test set as well?
Hey @Gokul_R1,
Welcome, and we are glad that you could become a part of our community.
If we don’t use a dev set, and just use the test set to tune our model’s hyper-parameters, then yes, it’s definitely possible for the model to overfit to the test set. That is exactly why we use a dev set.
Now, let’s assume that we created all three sets: train, dev and test. However, we repeated the cycle of “train the model on the train set + tune the hyper-parameters using the dev set + evaluate the model on the test set” a huge number of times. In that case, the model can even end up overfitting the test set, because in the end we would keep the hyper-parameters that give the best performance on the test set.
In this case, we would need yet another test set, which hasn’t been seen by the model, so that we can judge the model’s performance prior to deploying it in the real world.
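For a rough feel of how large this effect can be, here is a back-of-the-envelope bound. It is a sketch under assumptions that are mine, not part of the discussion above: all K candidate settings share the same true accuracy a, and the test-set estimate of each one is approximately Gaussian around a with the usual binomial standard error. The symbols a, n, K and σ are introduced only for this illustration.

```latex
% Assumption: each of the K settings has true accuracy a; its measured
% test accuracy on n examples is \hat{a}_k = a + \varepsilon_k.
\hat{a}_k = a + \varepsilon_k, \qquad
\varepsilon_k \sim \mathcal{N}(0, \sigma^2), \qquad
\sigma = \sqrt{\frac{a(1-a)}{n}}

% Keeping the best of K measured scores inflates the reported number by at most
\mathbb{E}\Big[\max_{1 \le k \le K} \hat{a}_k\Big] - a \;\le\; \sigma \sqrt{2 \ln K}

% Worked example: a = 0.8, n = 1000, K = 100
\sigma = \sqrt{\frac{0.8 \cdot 0.2}{1000}} \approx 0.0126, \qquad
\sqrt{2 \ln 100} \approx 3.03, \qquad
\sigma \sqrt{2 \ln K} \approx 0.038
```

So in this hypothetical setup, repeating the cycle 100 times and keeping the best test score could inflate the reported accuracy by up to roughly 4 percentage points even though no setting is actually better than another. A fresh, untouched test set is what removes that inflation.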
I hope this helps, let us know if you have any further queries.
Cheers,
Elemento