Polynomial Regression - choosing the degree of the nonlinearity

Hi,
I just watched the lecture on polynomial regression. In a problem with multiple features, the model is
f_{\overrightarrow{\mathrm{w}}, b}(\overrightarrow{\mathrm{x}})=\overrightarrow{\mathrm{w}} \cdot \overrightarrow{\mathrm{x}}+b

How do we choose the degree of the nonlinearity for a feature x_i? Do we plot each feature against y, check by eye how well it fits y = x^n, and choose an n?

In a word: Experimentation.

You want to add just enough complexity to get “good enough” results, without making the hypothesis so complex that it overfits the data set.

There is more material about this issue in later lectures.

Typically you won’t be plotting the hypothesis and assessing it by eye, because you can’t make an effective plot if there are more than 3 features.
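As a rough illustration of what "experimentation" can look like in practice, here is a minimal sketch for a single feature: fit several polynomial degrees on a training split and compare the error on a held-out validation split. The data is synthetic, and the function name `choose_degree`, the split sizes, and `max_degree` are assumptions made for this example, not something from the lectures.

```python
import numpy as np

def choose_degree(x_train, y_train, x_val, y_val, max_degree=8):
    """Fit polynomials of degree 1..max_degree and return
    (best_degree, errors), where errors[d] is the validation MSE."""
    errors = {}
    for degree in range(1, max_degree + 1):
        # Least-squares polynomial fit on the training split only.
        coeffs = np.polyfit(x_train, y_train, degree)
        # Score the fit on data it has not seen.
        pred = np.polyval(coeffs, x_val)
        errors[degree] = float(np.mean((pred - y_val) ** 2))
    best_degree = min(errors, key=errors.get)
    return best_degree, errors

# Synthetic data (an assumption for this sketch): a cubic trend plus noise.
rng = np.random.default_rng(0)
x = rng.uniform(-3, 3, 200)
y = 0.5 * x**3 - x + rng.normal(0, 1.0, x.size)

# Hold out the last 80 points for validation.
x_train, y_train = x[:120], y[:120]
x_val, y_val = x[120:], y[120:]

best_degree, val_errors = choose_degree(x_train, y_train, x_val, y_val)
for degree, err in val_errors.items():
    print(f"degree {degree}: validation MSE = {err:.3f}")
print("lowest validation error at degree", best_degree)
```

The validation error usually drops sharply until the model is complex enough, then flattens or rises again as the model starts to overfit. That comparison replaces the plot when there are too many features to visualize.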

Hi TMosh, thanks for your answer.
All right, that sounds OK.

So, for instance, if I have 1000 features, which ones should I make nonlinear? Should I try all the different combinations? It sounds like I would spend a lot of time working out which features are best to select, because there are so many possible combinations.

What is the rule of thumb?

Hello @Aziadocs

You already have an inkling of the Herculean task ahead, given the number of possible combinations when modeling 1000 features. As @TMosh put it:

That is a serious caution: we cannot casually skip the lower-order polynomial combinations and jump directly to the higher-order polynomials.
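To put a rough number on that task, here is a small sketch that counts how many polynomial terms exist for a given number of features and maximum degree. It assumes we include every monomial up to that total degree (powers of single features plus all cross terms) and leave out the constant term; the function name `num_polynomial_terms` is made up for this example.

```python
from math import comb

def num_polynomial_terms(n_features, degree):
    """Count monomials of total degree 1..degree in n_features variables.

    The number of monomials of total degree <= degree (including the
    constant) is C(n_features + degree, degree); subtract 1 to drop
    the constant, which plays the role of the bias term b.
    """
    return comb(n_features + degree, degree) - 1

# Sanity check with 2 features and degree 2: x1, x2, x1^2, x1*x2, x2^2.
print(num_polynomial_terms(2, 2))

# How the count grows for 1000 features.
for degree in (1, 2, 3):
    print(degree, num_polynomial_terms(1000, degree))
```

With 1000 features, going to degree 2 alone already gives about half a million terms, which is why trying every combination by hand is not realistic.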

So hold on to this thought and the concern, but rest assured it gets better from here. As you step into Course 2, Prof. Andrew will introduce more advanced models, where you will see firsthand how this concern can be mitigated.

There are better techniques ahead.