Hi,
Professor Ng clearly explains the importance of bias and variance when selecting a model for linear regression problems in week 3 of course 2 of the ML Specialization.
He presents two options:
- The first is to build a handful of models with different polynomial degrees, say from degree 1 to 10, and then choose the one with the lowest cross-validation error.
- The second is to take a fixed polynomial degree, e.g. 4, for a regression problem and then choose the "just right" regularization term by comparing errors on the cross-validation set.
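
To make the two options concrete, here is a minimal sketch of how I understand them, not the course's own code. The synthetic data, the 70/30 train/cross-validation split, the list of candidate regularization values, and the use of scikit-learn's `Ridge` (whose `alpha` plays the role of the course's lambda, up to a scaling factor) are all my assumptions.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

# Synthetic 1-D data (an assumption, for illustration only).
rng = np.random.default_rng(0)
x = rng.uniform(-3, 3, size=(200, 1))
y = 0.5 * x[:, 0] ** 3 - x[:, 0] + rng.normal(scale=1.0, size=200)

# Hold out a cross-validation set (70/30 split is an arbitrary choice).
x_train, x_cv, y_train, y_cv = train_test_split(
    x, y, test_size=0.3, random_state=0
)


def cv_error(model):
    """Fit on the training set and return the MSE on the cross-validation set."""
    model.fit(x_train, y_train)
    return mean_squared_error(y_cv, model.predict(x_cv))


# Option 1: vary the polynomial degree, no regularization.
degree_errors = {
    degree: cv_error(
        make_pipeline(
            PolynomialFeatures(degree, include_bias=False),
            StandardScaler(),
            LinearRegression(),
        )
    )
    for degree in range(1, 11)
}
best_degree = min(degree_errors, key=degree_errors.get)

# Option 2: fix the degree at 4, vary the regularization strength.
lambdas = [0.001, 0.01, 0.1, 1, 10, 100]  # hypothetical candidate values
lambda_errors = {
    lam: cv_error(
        make_pipeline(
            PolynomialFeatures(4, include_bias=False),
            StandardScaler(),
            Ridge(alpha=lam),
        )
    )
    for lam in lambdas
}
best_lambda = min(lambda_errors, key=lambda_errors.get)

print("best degree:", best_degree)
print("best lambda at degree 4:", best_lambda)
```

In both cases the selection is made only from the cross-validation error, so a separate test set would still be needed to report an unbiased final error.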
But the missing point is which method we should follow when choosing our model. Should I arbitrarily choose a polynomial degree for my model and then find a "just right" regularization term? Or should I adopt the first approach to come up with the best polynomial degree, one that neither overfits nor underfits?
(I am not even talking about complicating the problem further by adding more features to the model. This, I believe, will be dealt with in week 3-decision trees, or I will need to study feature engineering by myself.)
Or maybe I should first apply the first method to establish the best polynomial degree for the model and then use the second method to find the best regularization term. The best of both worlds, as they say.
Or, alternatively, is there some magic function in scikit-learn that figures out all these issues and gives me the best result?
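
As far as I can tell, the closest thing to such a function is `GridSearchCV`, which can search the polynomial degree and the regularization strength jointly with k-fold cross-validation. The sketch below is only an illustration: the synthetic data, the step names `poly`, `scale` and `ridge`, the candidate grids, and the choice of 5 folds are my assumptions, and `Ridge`'s `alpha` stands in for the course's lambda.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

# Synthetic 1-D data (an assumption, for illustration only).
rng = np.random.default_rng(0)
x = rng.uniform(-3, 3, size=(200, 1))
y = 0.5 * x[:, 0] ** 3 - x[:, 0] + rng.normal(scale=1.0, size=200)

# Polynomial expansion, feature scaling, then regularized linear regression.
pipeline = Pipeline(
    [
        ("poly", PolynomialFeatures(include_bias=False)),
        ("scale", StandardScaler()),
        ("ridge", Ridge()),
    ]
)

# Every (degree, alpha) combination is evaluated, so the two options
# are searched together instead of one after the other.
param_grid = {
    "poly__degree": list(range(1, 11)),
    "ridge__alpha": [0.001, 0.01, 0.1, 1, 10, 100],
}

search = GridSearchCV(
    pipeline,
    param_grid,
    cv=5,  # 5-fold cross-validation (arbitrary choice)
    scoring="neg_mean_squared_error",
)
search.fit(x, y)

best_params = search.best_params_
print("best parameters:", best_params)
print("best CV MSE:", -search.best_score_)
```

A joint search like this avoids having to decide which of the two options comes first, at the cost of fitting one model per grid point per fold (here 10 x 6 x 5 = 300 fits).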
Cheers friends,
Mehmet