Jtrain, Jcv, Jtest

I don't understand, can someone give me a simpler intuition on why we cannot use Jtest as a measurement of how a model is doing? Is it because, if we only focus on making Jtest small, the model will overfit to the test dataset?

Furthermore, it is mentioned that to pick the best model, we need a low Jtrain and Jcv. Why don't we just use the lowest Jtrain and Jtest?

Hi @Zolids

If you tune your model to minimize J_{\text{test}}, you risk overfitting to the test set. The test set is meant to be a final, unbiased evaluation of your model’s performance on unseen data, not a part of the training or model selection process.

The reason we look for low J_{\text{train}} and J_{\text{cv}} is that J_{\text{train}} ensures the model fits the training data well, while J_{\text{cv}} gives an estimate of the model’s performance on unseen data (to avoid overfitting).
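As a quick illustration (a minimal sketch, assuming a regression setting with the squared-error cost from the course; `model`, `X_train`, `X_cv`, `y_train`, `y_cv` are placeholder names), both quantities are the same cost function, just evaluated on different splits of the data:

```python
import numpy as np

def squared_error_cost(y_true, y_pred):
    # J = (1 / 2m) * sum((y_hat - y)^2), the cost used in the course
    return np.mean((y_pred - y_true) ** 2) / 2

# J_train: how well the model fits the data it was trained on
# j_train = squared_error_cost(y_train, model.predict(X_train))
# J_cv: an estimate of how the model does on data it has never seen
# j_cv = squared_error_cost(y_cv, model.predict(X_cv))
```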

Hope it helps! Feel free to ask if you need further assistance.

So basically the Jcv and Jtrain data should come from the same distribution; however, Jcv is only used to check which model is best to use, and Jtest is only for the final evaluation?

Do we fit the parameters using Jcv too?
Thanks for answering anyway.

Yes, the training and cross-validation data should come from the same distribution, with J_{\text{cv}} used for model selection and J_{\text{test}} reserved for final evaluation. We fit the model parameters using the training set only, then use the cross-validation set to select the best model by comparing J_{\text{cv}} across the candidates. The test set is used only once at the end to provide an unbiased estimate of the model’s performance on unseen data.
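Here is a minimal sketch of that workflow, assuming scikit-learn and some synthetic regression data purely for illustration (the 60/20/20 split and the polynomial-degree search are my own example, not something prescribed by the course):

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.metrics import mean_squared_error

# Synthetic data just for the example
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X).ravel() + 0.3 * rng.normal(size=200)

# 60% train / 20% cv / 20% test
X_train, X_tmp, y_train, y_tmp = train_test_split(X, y, test_size=0.4, random_state=0)
X_cv, X_test, y_cv, y_test = train_test_split(X_tmp, y_tmp, test_size=0.5, random_state=0)

# Fit parameters on the training set only; compare candidate models with J_cv
# (plain MSE here; the extra factor of 1/2 from the course does not change which model wins)
best_degree, best_j_cv, best_model = None, np.inf, None
for degree in range(1, 11):
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(X_train, y_train)
    j_cv = mean_squared_error(y_cv, model.predict(X_cv))
    if j_cv < best_j_cv:
        best_degree, best_j_cv, best_model = degree, j_cv, model

# The test set is touched exactly once, after the model has been chosen
j_test = mean_squared_error(y_test, best_model.predict(X_test))
print(f"chosen degree = {best_degree}, J_cv = {best_j_cv:.4f}, J_test = {j_test:.4f}")
```

Because J_{\text{test}} is computed only after the degree has been picked, it stays an unbiased estimate of how the chosen model will do on new data.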
