Hi @Saim_Rehman,
I made up some cost values there.
Given the four cost values, which d (among 1, 2, 3, 10) works the best?
Raymond
d=2 should be the best, because it has the lowest cost.
Hello @Saim_Rehman,
Yes! Among the J_{test} values that I made up for our discussion, d=2 has the lowest cost and thus works the best! It is great that you were able to step outside the slide (which assumed d=5 to be the best) and make the correct judgement based on your own knowledge! It is very important that we understand the rationale behind the choice rather than just sticking to the slides.
Cheers,
Raymond
My question was: why is J_{test} a problem, as Andrew mentioned in the image attached, and why is using a CV set better?
Alright, I have just watched the video again; you are asking about the part starting at roughly 1:45.
The problem is that you would be choosing the best model based on J_{test} and then reporting that same J_{test} as the generalization error. Your model comprises two parts: trainable parameters and hyperparameters, and both need to be set just right before you can say "I have trained a model". The training set gets you the best set of trainable parameters, but it does not tune the hyperparameters for you. Here, d is a hyperparameter, and to choose the best d we would be using the test set.

In a broader sense, then, both the "training set" and the "test set" have become data used to create the model. Consequently, neither of them can give you a fair estimate of the generalization error, because that error should be estimated on unseen data. A model created using both the training set and the test set is expected to perform well on both of them; we need a set of data it has never seen to get an honest estimate of the generalization error.
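Here is a toy simulation of this bias (my own made-up illustration, not from the lecture). Suppose every candidate model is equally good, with the same true error, and each measured J_{test} is just that true error plus noise from a finite test set. Picking the model with the lowest J_{test} and reporting that number still comes out optimistic:

```python
import numpy as np

# Toy illustration: 10 candidate models, all with the SAME true error of 1.0.
# Each measured J_test is the true error plus noise from a finite test set.
rng = np.random.default_rng(1)
true_error = 1.0
n_trials, n_models = 10_000, 10

measured = true_error + rng.normal(scale=0.2, size=(n_trials, n_models))
reported = measured.min(axis=1)  # pick the model with the lowest J_test, report that J_test

print(f"true error:    {true_error:.3f}")
print(f"mean reported: {reported.mean():.3f}")  # noticeably below 1.0 -> optimistic
```

Selecting the minimum of several noisy measurements systematically lands below the true value, which is exactly why the selected J_{test} cannot double as a fair generalization estimate.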
Therefore, the correct way is to choose the best model based on J_{cv} and report J_{test} as the generalization error: tune the trainable parameters with the training set, choose the best hyperparameters with the cross-validation set, and keep the test set invisible throughout the process of creating the model. Once the model is created, evaluate it on the test set and report that as the estimated generalization error.
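To make the procedure concrete, here is a minimal sketch in Python using scikit-learn. It reuses the degrees 1, 2, 3, 10 from above; everything else (the synthetic data, the 60/20/20 split) is an assumption for illustration, not the course's exact setup:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import PolynomialFeatures

# Synthetic 1-D regression data (a stand-in for the course's example)
rng = np.random.default_rng(0)
x = rng.uniform(-3, 3, size=(300, 1))
y = 0.5 * x[:, 0] ** 2 + rng.normal(scale=1.0, size=300)

# 60/20/20 split: training / cross-validation / test
x_train, x_rest, y_train, y_rest = train_test_split(x, y, test_size=0.4, random_state=0)
x_cv, x_test, y_cv, y_test = train_test_split(x_rest, y_rest, test_size=0.5, random_state=0)

best_d, best_j_cv, best_model, best_poly = None, float("inf"), None, None
for d in (1, 2, 3, 10):
    poly = PolynomialFeatures(degree=d)
    # Training set: tunes the trainable parameters (the linear coefficients)
    model = LinearRegression().fit(poly.fit_transform(x_train), y_train)
    # CV set: tunes the hyperparameter d
    j_cv = mean_squared_error(y_cv, model.predict(poly.transform(x_cv)))
    if j_cv < best_j_cv:
        best_d, best_j_cv, best_model, best_poly = d, j_cv, model, poly

# Only now, with the model fully chosen, touch the test set once
j_test = mean_squared_error(y_test, best_model.predict(best_poly.transform(x_test)))
print(f"best d = {best_d}, J_cv = {best_j_cv:.3f}, J_test (generalization estimate) = {j_test:.3f}")
```

The key point is the last step: the test set plays no role in choosing either the coefficients or d, so the final J_{test} is a fair estimate.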
Raymond