Practice lab C2_W3: CV/Train high bias (simpler model)

Categorization error (training set) — regularized: 0.072, simple model: 0.062, complex model: 0.007
Categorization error (CV set) — regularized: 0.066, simple model: 0.087, complex model: 0.113
The simple model is a bit better on the training set than the regularized model, but it is worse on the cross-validation set.

Shouldn’t the regularized model have a lower training error than the simple model (which is equivalent to having an extremely high lambda) in general?


Hey @karra1729,
I guess we are clear on why the simple model performs worse on the CV set than the regularized model: since it is too simple a model, it underfits the data, and thereby performs worse on the CV set.

Now, the question remains: “Does the regularized model always perform better on the training set than the simple model?”, and the answer is “not always”. This is because the simple model (in the lab) is a different model altogether (one dense layer fewer), so you don’t know what high value of \lambda the simple model corresponds to. Perhaps the simple model corresponds only to \lambda = 0.05, which is still less than the \lambda = 0.1 used for the regularized model in the lab, and hence it overfits the training data more.

The fact that a simple model can be thought of as a model corresponding to a very high value of \lambda is definitely true, but firstly, you don’t know what this “high value” is, since \lambda is bounded on only one side, at 0; on the other side, you can set it as high as you want. And secondly, since the simple model is an altogether different architecture, it may perform better on the training set in some cases.
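
To make that concrete, here is the L2-regularized cost in the form the course uses (just a sketch; L is whatever loss the lab uses and f_{\mathbf{w},b} is the network):

$$
J(\mathbf{w}, b) = \frac{1}{m}\sum_{i=1}^{m} L\!\left(f_{\mathbf{w},b}\!\left(\mathbf{x}^{(i)}\right),\, y^{(i)}\right) + \frac{\lambda}{2m}\sum_{j} w_j^{2}
$$

As \lambda \to \infty, minimizing J drives all the weights towards 0, which is why a very high \lambda behaves like a much simpler model; but for a separately built smaller architecture, there is no way to read off which finite \lambda it “corresponds” to.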

However, if we keep the same structure for all three models and only vary the regularization value, can we get the same results as in the lab? This might be an interesting question! Give me some time to whip up a quick experiment and see whether it happens. Till then, let me know if this helps.

Cheers,
Elemento


Hey @karra1729,
Please check out Version 11 of this kernel. In this version, I have used the same NN architecture with 3 different \lambda values: 0 for the complex model, 0.1 for the regularized model, and 1 for the simple model. There you can clearly see that the regularized model performs better on the training set than the simple model (high value of \lambda). However, this is just a single experiment; you can run multiple experiments like this to validate the hypothesis. I hope this helps.
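
If you want to reproduce the idea without the kernel, here is a minimal sketch of such an experiment. This is not the kernel’s exact code: the layer sizes roughly follow the lab’s complex model, and X_train, y_train, X_cv, y_cv, the optimizer settings, and the epoch count are placeholders you would take from the lab.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.regularizers import L2

def build_model(lambda_):
    """Same architecture every time; only the L2 strength differs."""
    return Sequential([
        Dense(120, activation="relu", kernel_regularizer=L2(lambda_)),
        Dense(40, activation="relu", kernel_regularizer=L2(lambda_)),
        Dense(6, activation="linear"),  # logits for the lab's 6 classes
    ])

def cat_error(model, X, y):
    """Categorization error = fraction of misclassified examples."""
    yhat = np.argmax(model.predict(X, verbose=0), axis=1)
    return np.mean(yhat != y)

# X_train, y_train, X_cv, y_cv are assumed to come from the lab's dataset.
# 0 -> "complex", 0.1 -> "regularized", 1 -> "simple"
for name, lam in [("complex", 0.0), ("regularized", 0.1), ("simple", 1.0)]:
    model = build_model(lam)
    model.compile(
        loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
        optimizer=tf.keras.optimizers.Adam(0.01),
    )
    model.fit(X_train, y_train, epochs=1000, verbose=0)
    print(f"{name} (lambda={lam}): "
          f"train error = {cat_error(model, X_train, y_train):.3f}, "
          f"cv error = {cat_error(model, X_cv, y_cv):.3f}")
```

With the architecture held fixed, any difference between the three models comes only from \lambda, which is exactly the comparison the hypothesis needs.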

Cheers,
Elemento


Yeah, I thought exactly the same, because it’s a different architecture, but I just wanted to confirm. Thanks for taking such effort to explain in great detail.


I am glad I could help.

Cheers,
Elemento
