Why do we still call it linear regression when we add polynomials?

Justin_Ko · April 27, 2023, 11:13pm

Hello community!

I have a question on nomenclature. In these classes we learn about linear regression, where we are trying to predict a value y from a single feature x, and the model is of the form:

y = w * x + b

It’s clear why this particular form is called “linear” regression. y is linear with respect to x.

We also talk about adding polynomial features, so that the model might look like

y = w1 * x + w2 * x^2 + b

We still refer to this as a linear model, but the relationship between x and y is no longer linear.

My question is, when we add these new polynomial terms why don’t we refer to this as polynomial regression or use a different name? There must be some history on where the name comes from that I have missed.

TMosh · April 28, 2023, 12:20am

“linear” refers not to the shape of the f_wb curve, but to the process of basing f_wb on the linear combination of weights and features. (f_wb = w*x + b). That’s a linear relationship between w and b and x.

In your example, x^2 is considered an additional engineered feature.

Justin_Ko · April 28, 2023, 3:44am

Hi @TMosh thanks for the quick reply. You said

In your example, x^2 is considered an additional engineered feature.

Does this mean we would consider an example like “y = w1 * x + w2 * x^2 + b” to not be linear regression? Since there is no longer a linear relationship between w and b and x?

Maybe an example of something that is non-linear is what I am looking for.

Justin_Ko · April 28, 2023, 4:55am

Found this page which provides an explanation and some examples of linear regression and non-linear regression.
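For readers without that page: one common convention (stated here as an assumption, since the linked page is not reproduced in this thread) is that "linear" means linear in the parameters, not in x. A minimal sketch of that distinction, where `poly_model`, `exp_model`, and `is_linear_in_params` are hypothetical names chosen for illustration:

```python
import numpy as np

def poly_model(params, x):
    # y = w1*x + w2*x^2 + b: curved in x, but linear in (w1, w2, b)
    w1, w2, b = params
    return w1 * x + w2 * x**2 + b

def exp_model(params, x):
    # y = a * exp(k*x): the parameter k sits inside the exponential,
    # so the model is not linear in its parameters
    a, k = params
    return a * np.exp(k * x)

def is_linear_in_params(model, p, q, x):
    # Superposition check: does model(p + q) equal model(p) + model(q)?
    lhs = model(p + q, x)
    rhs = model(p, x) + model(q, x)
    return bool(np.allclose(lhs, rhs))

x = np.linspace(-1.0, 1.0, 5)
poly_linear = is_linear_in_params(
    poly_model, np.array([1.0, 2.0, 0.5]), np.array([-0.3, 0.7, 1.5]), x
)
exp_linear = is_linear_in_params(
    exp_model, np.array([1.0, 2.0]), np.array([0.5, -1.0]), x
)
print(poly_linear, exp_linear)
```

Under this convention the polynomial model passes the check and the exponential one does not, which is why the former can still be treated as linear regression.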

TMosh · April 28, 2023, 3:07pm

It’s still linear regression, because all you’ve done is add another feature.
Once you compute x^2, it’s just a feature (a specific real value) just like any other feature.
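As a rough illustration of this point, the sketch below treats x^2 as one more column of numbers and solves an ordinary linear least-squares problem. The data and the weights 3.0, -1.5, and 0.5 are made up for the example, and `np.linalg.lstsq` stands in for whatever training method the course uses (for example, gradient descent).

```python
import numpy as np

# Hypothetical noiseless data generated from y = 3.0*x - 1.5*x^2 + 0.5
x = np.linspace(-2.0, 2.0, 21)
y = 3.0 * x - 1.5 * x**2 + 0.5

# Once computed, x^2 is just another column of real values.
# The column of ones carries the bias b.
X = np.column_stack([x, x**2, np.ones_like(x)])

# Plain linear least squares over the weights (w1, w2, b)
w, *_ = np.linalg.lstsq(X, y, rcond=None)
w1, w2, b = w
print(w1, w2, b)
```

Nothing in the solver knows that the second column came from squaring the first; it only sees a linear combination of columns.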

Degeye · April 30, 2023, 10:39am

Hi justin_Ko,
Your question is very interesting, and this is an aspect that is not often explained. Once the feature x is instantiated (or mapped) with its value in the dataset, we have to see it as a simple value, forgetting how it was created and which function was applied. The variables in this equation are the weights, which evolve during the training phase.

Christian_Simonis · April 30, 2023, 6:46pm

In addition to @TMosh’s excellent reply:

You could also formulate:

y = w1 * x + w2 * x^2 + b

as

y = w_1 * x_1 + w_2 * x_2 + b, which is a linear model y = X * w

with your matrix X, consisting of your features
- x_1 = f_1(x) = x and
- x_2 = f_2(x) = x^2.
Your features are well defined, and you parametrise the model by fitting its weights.
In general, f(x) could be any suitable nonlinear function or model for each feature, which lets you encode domain knowledge! This strategy means capturing the nonlinearity of your modelling problem in your features. (Of course, f(x) could also just be a linear function, as in f_1(x).)
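The matrix form y = X w does not show the bias b explicitly. One common way to reconcile the two forms (an assumption about the intended notation, not something stated in this post) is to append a column of ones to X and treat b as one more weight. For m training examples x^{(1)}, ..., x^{(m)}:

```latex
\underbrace{\begin{bmatrix} y^{(1)} \\ \vdots \\ y^{(m)} \end{bmatrix}}_{y}
=
\underbrace{\begin{bmatrix}
  f_1(x^{(1)}) & f_2(x^{(1)}) & 1 \\
  \vdots       & \vdots       & \vdots \\
  f_1(x^{(m)}) & f_2(x^{(m)}) & 1
\end{bmatrix}}_{X}
\underbrace{\begin{bmatrix} w_1 \\ w_2 \\ b \end{bmatrix}}_{w},
\qquad f_1(x) = x, \quad f_2(x) = x^2 .
```

Each row reproduces y = w_1 x_1 + w_2 x_2 + b for one example. The entries of X are fixed numbers once the features are computed, so the only unknowns are the entries of w, and they appear linearly.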

These threads might be interesting for you:

Hope that helps!

Happy learning and best regards
Christian

gnart · September 20, 2023, 10:51pm

I was wondering the same thing and found that lab 4 of week 2 is quite helpful in explaining this.
It covers polynomial features, and we are still using a linear regression model to train:

An Alternate View
Above, polynomial features were chosen based on how well they matched the target data. Another way to think about this is to note that we are still using linear regression once we have created new features. Given that, the best features will be linear relative to the target. This is best understood with an example.

I still need to understand what non-linear regression is to complete the picture, though.

TMosh · September 20, 2023, 10:58pm

We’re using the linear combination of features that were generated using polynomial terms.

Once the new features are computed, they’re just constants, and can be combined linearly.

Personally I don’t think non-linear regression is a different thing. It’s just identifying that the model isn’t a straight line in 2D.

gnart · September 20, 2023, 11:09pm

So for non-linear regression, an example could be y = cos(x) ?

TMosh · September 20, 2023, 11:16pm

Not exactly.

cos(x) might be a nice way to generate some new features, but you would not call them ‘y’.
‘y’ is used for the labels of the dataset.

Recall that when you’re doing regression, you just have some input features and some output labels. You don’t know how they were generated. So you don’t really want to try to specify the form of the output. You want to create more complicated features (from non-linear processes) so that the model can be more complex.
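To make this concrete, here is a small sketch that fits the same linear model twice: once on the raw feature x and once on the engineered feature cos(x). The labels are synthetic and the helper `fit_rmse` is a hypothetical name; in a real problem you would not know how the labels were generated.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0.0, 2.0 * np.pi, 50)
# Hypothetical labels: in practice the generating process is unknown
y = 2.0 * np.cos(x) + 1.0 + 0.05 * rng.standard_normal(x.size)

def fit_rmse(feature):
    # Linear regression on one feature plus a bias column of ones
    X = np.column_stack([feature, np.ones_like(feature)])
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return float(np.sqrt(np.mean((X @ w - y) ** 2)))

rmse_raw = fit_rmse(x)          # feature is x itself
rmse_cos = fit_rmse(np.cos(x))  # feature is the engineered cos(x)
print(rmse_raw, rmse_cos)
```

The model and the fitting procedure are identical in both calls. Only the feature changes, and with it how well a linear combination can match the labels.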

gnart · September 20, 2023, 11:23pm

Yeah, I realized that cos(x) is just another engineered feature. Thanks for clarifying.
