Cost function : Mean squared error

Nikhilesh · December 11, 2022, 5:09am

What is the logic behind squaring the error distance rather than taking its absolute value, since both get rid of the negative sign? Since we already average the squared error to avoid large values (as the prof said), we could have just taken absolute values. Please help. Thank you.

TMosh · December 11, 2022, 5:40am

We use the square of the error distance because:

1. We need the cost function to have a continuous partial derivative, because that’s how we find the gradients that we need for gradient descent. The absolute value does not have one: its derivative jumps from -1 to +1 at zero error and is undefined there.
2. The squared error cost function has a very simple partial derivative which is easily computed.
3. The squared error cost function emphasizes the correction of large magnitude errors.
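To make the first and third points concrete, here is a minimal sketch (not from the course) that compares the derivative of the two losses with respect to a single residual. The function names are hypothetical, and the squared loss is assumed to carry a factor of 1/2 so that its derivative is just the residual. Returning 0.0 for the absolute loss at a residual of exactly zero is a convention, since the derivative is undefined there.

```python
def squared_loss_grad(residual):
    # Derivative of 0.5 * r**2 with respect to r is r itself:
    # continuous everywhere and proportional to the size of the error.
    return residual


def absolute_loss_grad(residual):
    # Derivative of |r| is sign(r). It is undefined at r == 0;
    # returning 0.0 there is a convention (one valid subgradient).
    if residual > 0:
        return 1.0
    if residual < 0:
        return -1.0
    return 0.0


residuals = [-2.0, -0.001, 0.001, 2.0]
squared_grads = [squared_loss_grad(r) for r in residuals]
absolute_grads = [absolute_loss_grad(r) for r in residuals]

for r, gs, ga in zip(residuals, squared_grads, absolute_grads):
    print(f"residual={r:+.3f}  squared grad={gs:+.3f}  absolute grad={ga:+.1f}")
```

Near zero the squared-error gradient shrinks smoothly, while the absolute-error gradient flips between -1 and +1. For large residuals the squared-error gradient grows with the error, which is the "emphasis on large errors" in the third point.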

Nikhilesh · December 11, 2022, 6:52am

Thank you.
So this is what I understood: since the absolute value function is not differentiable at x = 0, we use the squared error (a parabola) instead. Is this correct?

TMosh · December 11, 2022, 11:44pm

That is only one of the reasons.

rmwkwok · December 12, 2022, 12:08pm

Hello!

In my opinion, we would be going too far in judging why to square rather than take the absolute value here, given that @Nikhilesh didn’t even mention what problem is at hand.

For the purpose of the course, given the great properties that Tom mentioned, I think it is sufficient for us to use the squared error as a starting point. We also can’t forget that, in a linear regression problem, the squared error cost is convex, so there are no local minima other than the global one and gradient descent cannot get stuck in a worse one. This simplifies the discussion and lets us focus on what the course is meant to deliver.
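As a small illustration of that single minimum, here is a sketch under stated assumptions: a one-parameter model f(x) = w * x, a made-up dataset, and a hypothetical `gradient_descent` helper with a hand-picked learning rate. Starting from two very different values of w, gradient descent on the squared error cost ends up at the same place.

```python
# Toy data generated by y = 2 * x (illustrative values, not from the course).
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]


def cost_gradient(w):
    # Derivative of J(w) = 1/(2m) * sum((w*x - y)**2) with respect to w.
    m = len(xs)
    return sum((w * x - y) * x for x, y in zip(xs, ys)) / m


def gradient_descent(w_start, learning_rate=0.1, steps=200):
    # Plain batch gradient descent on the single parameter w.
    w = w_start
    for _ in range(steps):
        w -= learning_rate * cost_gradient(w)
    return w


w_from_low = gradient_descent(-5.0)
w_from_high = gradient_descent(10.0)
print(w_from_low, w_from_high)
```

Both runs should settle at essentially the same w, because the cost is a single bowl in w with nowhere else to stop.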

@Nikhilesh, your reply is about Tom’s first point. Also, we can’t say that one is absolutely better than the other here without any details about the problem you are facing. For example, as Tom said in his third point, the squared error emphasizes large magnitude errors; if you don’t want that, you may want to try another loss function.

Raymond
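To see what "emphasizing large magnitude errors" can mean in practice, here is a minimal sketch with made-up numbers. It assumes the simplest possible model, a single constant prediction c for every example. For that model the mean of the targets minimizes the mean squared error and the median minimizes the mean absolute error, so the two losses react very differently to one outlier.

```python
import statistics

# Illustrative targets with one large outlier (not real course data).
targets = [1.0, 2.0, 3.0, 4.0, 100.0]


def mean_squared_error(c):
    # Average squared distance between the constant prediction c and each target.
    return sum((c - y) ** 2 for y in targets) / len(targets)


def mean_absolute_error(c):
    # Average absolute distance between the constant prediction c and each target.
    return sum(abs(c - y) for y in targets) / len(targets)


best_for_squared = statistics.mean(targets)     # pulled toward the outlier
best_for_absolute = statistics.median(targets)  # stays with the bulk of the data

print("squared-error choice:", best_for_squared)
print("absolute-error choice:", best_for_absolute)
```

The squared error drags the prediction far toward the outlier, while the absolute error leaves it among the typical values. Which behaviour is preferable depends on the problem, which is why the details matter.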
