Query about Squared Error Cost Function

Can anyone please tell me why we square the difference between the "prediction" and the "target"? It makes sense that we sum up the errors over every i-th training example and take the average.

But if the reason we take the average is to avoid a bigger error value when there are thousands of training examples, doesn't squaring produce a bigger number too? Why do we square in the first place? Is it so that the difference between "y-hat" and "y" is always positive?


Squaring the errors has several benefits:

  • Positive and negative errors are treated the same, so they cannot cancel each other out in the sum.
  • Very large errors are emphasized, since their contribution grows quadratically.
  • The squared-error cost function is convex, so it has a single global minimum.
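To make the first two points concrete, here is a minimal sketch of a squared-error cost (the plain mean of squared errors; some courses divide by 2m instead, which only rescales the cost). The toy arrays `y` and `y_hat` are made-up values for illustration:

```python
import numpy as np

def mse_cost(y_hat, y):
    """Mean squared error: average of the squared prediction errors."""
    errors = y_hat - y
    return np.mean(errors ** 2)

y = np.array([3.0, 5.0, 7.0])
y_hat = np.array([2.0, 6.0, 10.0])  # errors: -1, +1, +3

# Squaring makes -1 and +1 contribute equally (1 each, no cancellation),
# while the larger error (+3) contributes 9 and dominates the cost.
print(mse_cost(y_hat, y))  # (1 + 1 + 9) / 3
```

Note that without the square, the errors -1 and +1 would cancel and make the fit look better than it is; taking the absolute value would also fix that, but the square is differentiable everywhere, which is convenient for gradient descent.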