What is the value of the generalization error estimate using the "test" set of data?

Amit_Misra1 · September 12, 2023, 1:29am

When choosing a NN architecture, I understand Andrew to be saying that you train each of the 3 candidate models on the training data to get the weights for each one, then run the CV data through them to see which model gives the lowest loss (J).

Once you have chosen a model, what's the reasoning behind, or the value of, estimating the generalization error using a test set?

Also, is there any role of randomizing the data several times and going through the above process with different sets of training, CV and test data to confirm which model is optimal?

lukmanaj · September 12, 2023, 4:53am

The reason the test set is used is to get an unbiased estimate of the generalization error. The CV error of the chosen model is optimistically biased, because the CV set was itself used to pick that model. After selecting your best-performing model, you get the unbiased performance by evaluating it on the test set, which has not been seen by any of the models, including the best-performing one.
If you do not really care about getting an unbiased estimate, you could just train all the models on the training set, evaluate them on the development (CV) set, and go with the best one without further testing.
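
To make the workflow concrete, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration: the data is synthetic, and three k-nearest-neighbour regressors stand in for the three NN architectures from the lecture (the helper names `make_data`, `knn_predict` and `mse` are made up). The point is only the order of operations: fit on the training set, pick a model on the CV set, and touch the test set once at the end.

```python
import random

def make_data(n, seed=0):
    """Synthetic 1-D regression data: y = x^2 plus Gaussian noise (illustrative only)."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.uniform(-3.0, 3.0)
        data.append((x, x * x + rng.gauss(0.0, 1.0)))
    return data

def knn_predict(train, x, k):
    """Predict by averaging the targets of the k training points nearest to x."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    return sum(y for _, y in nearest) / len(nearest)

def mse(train, data, k):
    """Mean squared error on `data` of the k-NN model fitted on `train`."""
    return sum((knn_predict(train, x, k) - y) ** 2 for x, y in data) / len(data)

data = make_data(300)
random.Random(1).shuffle(data)

# 60/20/20 split into training, cross-validation and test sets.
train, cv, test = data[:180], data[180:240], data[240:]

# Three hypothetical candidate models, standing in for three architectures.
candidates = [1, 10, 100]

# Step 1: each candidate is fitted on the training set and scored on the CV set.
cv_errors = {k: mse(train, cv, k) for k in candidates}

# Step 2: model selection uses the CV error only.
best_k = min(cv_errors, key=cv_errors.get)

# Step 3: the test set is used once, for the selected model only.
test_error = mse(train, test, best_k)

print("CV errors:", cv_errors)
print("selected k:", best_k)
print("test error of selected model:", test_error)
```

Because `best_k` is chosen as the minimum over the CV errors, its CV error tends to look a little better than it really is; the test error in step 3 is free of that selection effect, which is why it is the number to report.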
