I was surprised to see Dr. Ng lead us to the mean squared error cost function instead of least squares. When I learned about linear regression in a stats context (the “line of best fit”), I was taught the “residual sum of squares” and the “least squares” procedure.
Why is “mean squared error” used here instead of “least squares”? Would the w and b computed by the mean squared error approach be the same as the w and b computed by a least squares approach?
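To make sure I'm comparing the right things, here is what I understand the two cost functions to be (writing the course's cost with the 1/(2m) factor from the lectures; please correct me if I've misstated either one):

$$\mathrm{RSS}(w,b) = \sum_{i=1}^{m} \left( y^{(i)} - \bigl(w x^{(i)} + b\bigr) \right)^2$$

$$J(w,b) = \frac{1}{2m} \sum_{i=1}^{m} \left( f_{w,b}\bigl(x^{(i)}\bigr) - y^{(i)} \right)^2, \qquad f_{w,b}(x) = w x + b$$

As far as I can tell, J is just RSS scaled by the constant 1/(2m), so I'd expect both to be minimized by the same w and b, but I'd like to confirm that.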
Computing w and b with least squares seems pretty straightforward, “formulaic,” and procedural… is it slower or faster than using gradient descent to minimize the mean squared error?
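To make the comparison concrete, here is a rough sketch of what I mean by the two approaches (plain NumPy on some made-up single-feature data; the learning rate and iteration count are arbitrary choices of mine, and this is not code from the course):

```python
import numpy as np

# Made-up single-feature data, just for illustration
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=50)
y = 3.0 * x + 2.0 + rng.normal(0, 1.0, size=50)
m = len(x)

# Least squares via the closed-form / "formulaic" route
X = np.column_stack([x, np.ones(m)])            # design matrix [x, 1]
w_ls, b_ls = np.linalg.lstsq(X, y, rcond=None)[0]

# Gradient descent on the mean squared error cost
# J(w, b) = (1 / (2m)) * sum((w*x + b - y)^2)
w, b = 0.0, 0.0
alpha = 0.01                                    # arbitrary learning rate
for _ in range(10_000):                         # arbitrary iteration count
    err = w * x + b - y
    w -= alpha * (err @ x) / m                  # partial of J with respect to w
    b -= alpha * err.sum() / m                  # partial of J with respect to b

print(f"closed form:      w={w_ls:.4f}, b={b_ls:.4f}")
print(f"gradient descent: w={w:.4f}, b={b:.4f}")
```

On a tiny example like this, both give (nearly) the same numbers and the closed form is essentially instant, so my question is really about whether that comparison holds up in general.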