First learnt about regularization, I thought lambda would only be applied to some of the features (e.g. x^3, x^4, …) as our choice to penalise high degree polynomial terms only. But seems that it will be applied to all features. I haven’t finished W3 but appreciate if there is an simple explanation .
Regularization may be necessary even if we have no idea how the features were generated. The example of adding polynomial terms is just an easy first introduction.
Since we don’t know which features may cause overfitting, we regularize all of them.
1 Like
Make sense. Is there a way to partially regularize and is it even applied in practice?
No, and no.
1 Like