Hey, I’m just curious. As Mr. Andrew said, we use a cost function with the squared error for linear regression models, but other models may use other formulas. That sparked my curiosity: why do we use the squared error in general instead of just taking the difference between ŷ and y, why do we square it in linear regression, and why is it different for other models?
Any feedback would be appreciated, but I would be especially thankful to anyone who could explain this in layman’s terms, without much computer nomenclature, to make it a bit more understandable for me.
Thanks again!
An advantage of squaring the errors is that it keeps them all non-negative, so we can sum them up without them canceling each other out. Assume we have the following list of errors: [1, -1, 2, -2, 3, -3, 4, -4]. If we don’t square them and just add them up, the cost becomes zero, which is unwanted because we can’t use that unsquared cost to observe the errors that actually exist.
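If it helps to see that in code, here is a tiny sketch in plain Python, using just the example list above (the numbers are only illustrative):

```python
# Errors where positives and negatives happen to balance out
errors = [1, -1, 2, -2, 3, -3, 4, -4]

raw_sum = sum(errors)                     # 0  -> hides the fact that errors exist
squared_sum = sum(e ** 2 for e in errors) # 60 -> reflects how far off we really are

print(raw_sum, squared_sum)
```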
We use the squared loss for linear regression, but it can also be used in other regression models, such as a neural network for regression or XGBoost regression trees.
Similarly, the log loss is not only used by logistic regression; it can also be used in other classification models, such as a neural network for classification or XGBoost classification trees.
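To make the two losses concrete, here is a small sketch with made-up numbers (the arrays are purely illustrative, not tied to any particular model or lab):

```python
import numpy as np

# Squared error (regression): average of (y_hat - y)^2
y_true_reg = np.array([3.0, 5.0, 2.0])
y_pred_reg = np.array([2.5, 5.5, 2.0])
squared_error = np.mean((y_pred_reg - y_true_reg) ** 2)

# Log loss (classification): -[y*log(y_hat) + (1 - y)*log(1 - y_hat)], averaged
y_true_clf = np.array([1, 0, 1])
y_pred_clf = np.array([0.9, 0.2, 0.7])  # predicted probabilities
log_loss = -np.mean(y_true_clf * np.log(y_pred_clf) +
                    (1 - y_true_clf) * np.log(1 - y_pred_clf))

print(squared_error, log_loss)
```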
Cheers,
Raymond