What is the difference between LinearRegression and SGDRegressor?

Hello,
I'd like to know: what is the difference between LinearRegression and SGDRegressor?

Hi @Rhayem_Bannouri,

I believe you are referring to sklearn.linear_model.SGDRegressor and sklearn.linear_model.LinearRegression.

In short, they find solutions in different ways, and SGDRegressor is closer to the gradient descent covered in the course. Below is a quick side-by-side sketch, and after it I will share how you can find out the difference yourself from the docs.
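On a simple dataset, both estimators recover essentially the same line (the synthetic data below is just my own toy example, not from the course):

```python
import numpy as np
from sklearn.linear_model import LinearRegression, SGDRegressor

# Toy data: y = 3*x1 - 2*x2 + 5 plus a little noise (made up for this demo).
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 2))
y = 3 * X[:, 0] - 2 * X[:, 1] + 5 + rng.normal(scale=0.1, size=1000)

# LinearRegression solves the least-squares problem directly.
lr = LinearRegression().fit(X, y)

# SGDRegressor updates the weights sample by sample with a shrinking
# learning rate; note it also adds a small L2 penalty by default.
sgd = SGDRegressor(max_iter=1000, tol=1e-6, random_state=0).fit(X, y)

print(lr.coef_, lr.intercept_)    # close to [ 3. -2.] and 5
print(sgd.coef_, sgd.intercept_)  # approximately the same values
```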

If you check out the SGDRegressor's doc, it explicitly says:

SGD stands for Stochastic Gradient Descent: the gradient of the loss is estimated each sample at a time and the model is updated along the way with a decreasing strength schedule (aka learning rate).

And you can find out more about its other configuration options in the doc.
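To watch that "one sample at a time" behavior yourself, you can drive the updates manually with partial_fit. A minimal sketch (the data and the bumped-up eta0 are my own choices, made so the movement is easy to see):

```python
import numpy as np
from sklearn.linear_model import SGDRegressor

# One-feature toy data: y = 4*x plus noise (made up for this demo).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 1))
y = 4 * X[:, 0] + rng.normal(scale=0.1, size=200)

# 'invscaling' is the default schedule: eta = eta0 / t**power_t, so the
# step size shrinks as t (the update count) grows. eta0 is raised from
# the default 0.01 so the progress per sample is more visible.
sgd = SGDRegressor(learning_rate="invscaling", eta0=0.1, power_t=0.25)

# Feed one sample at a time and watch the coefficient crawl toward 4.
for i in range(len(X)):
    sgd.partial_fit(X[i : i + 1], y[i : i + 1])
    if (i + 1) % 50 == 0:
        print(f"after {i + 1} samples: coef = {sgd.coef_[0]:.3f}")
```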

If you check out the LinearRegression's doc, besides its configurable parameters, you can also see that it says:

From the implementation point of view, this is just plain Ordinary Least Squares (scipy.linalg.lstsq) or Non Negative Least Squares (scipy.optimize.nnls) wrapped as a predictor object.

So you may check out, for example, scipy.linalg.lstsq by googling it, and find that lstsq provides a few LAPACK drivers to solve the problem:

lapack_driver : str, optional
    Which LAPACK driver is used to solve the least-squares problem. Options are 'gelsd', 'gelsy', 'gelss'. Default ('gelsd') is a good choice. However, 'gelsy' can be slightly faster on many problems. 'gelss' was used historically. It is generally slow but uses less memory.

You may further google those names or check out the source code for the implementation behind those names. For example, gelss and gelsd are SVD-based methods.
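And to convince yourself that LinearRegression really is just a wrapper, you can solve the same problem with scipy.linalg.lstsq directly and compare. A sketch (appending a column of ones for the intercept is my own bookkeeping; sklearn handles the intercept differently internally, but the result is the same):

```python
import numpy as np
from scipy.linalg import lstsq
from sklearn.linear_model import LinearRegression

# Same kind of toy data as above.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = 3 * X[:, 0] - 2 * X[:, 1] + 5 + rng.normal(scale=0.1, size=100)

# The wrapped version.
lr = LinearRegression().fit(X, y)

# The same least-squares problem handed to LAPACK's SVD-based 'gelsd'
# driver; the extra column of ones plays the role of the intercept.
A = np.hstack([X, np.ones((len(X), 1))])
coef, residues, rank, sv = lstsq(A, y, lapack_driver="gelsd")

print(lr.coef_, lr.intercept_)  # slope terms and intercept from sklearn
print(coef)                     # the same three numbers from lstsq
```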

Raymond