Cost_function formula_Difference between 2*m & m?

Hello Jordhan,

Regarding your question about 2m versus m: the 1/2 is just a convenience for gradient descent. When you differentiate the squared error, the power rule brings down a factor of 2, and that 2 cancels with the 1/2. Dropping the 1/2 only rescales the cost by a constant, so the minimum stays in the same place.
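To make the cancellation concrete, here is a short sketch of the derivative. The notation is my assumption, since the thread does not fix one: a linear model f(x) = w x + b, with m training examples (x^(i), y^(i)) and cost J(w, b).

```latex
% Assumed notation: f_{w,b}(x) = w x + b, m training examples (x^{(i)}, y^{(i)})
J(w,b) = \frac{1}{2m} \sum_{i=1}^{m} \left( f_{w,b}(x^{(i)}) - y^{(i)} \right)^2

\frac{\partial J}{\partial w}
  = \frac{1}{2m} \sum_{i=1}^{m} 2 \left( f_{w,b}(x^{(i)}) - y^{(i)} \right) x^{(i)}
  = \frac{1}{m} \sum_{i=1}^{m} \left( f_{w,b}(x^{(i)}) - y^{(i)} \right) x^{(i)}
```

With 1/m instead of 1/(2m), the same steps leave a factor of 2/m in front of the sum, which in practice just gets absorbed into the learning rate.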

The 1/m averages the squared error over the number of training examples, so the size of the cost does not grow just because the dataset is larger.
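If it helps, here is a small sketch you can run to check both points numerically. Everything in it is an illustrative assumption on my part (the toy data, the function names `cost` and `grad_w`, and the `halve` flag), not code from the course.

```python
import numpy as np

def cost(w, b, x, y, halve=True):
    """Squared-error cost, scaled by 1/(2m) if halve is True, else by 1/m."""
    m = x.shape[0]
    err = w * x + b - y
    scale = 2 * m if halve else m
    return np.sum(err ** 2) / scale

def grad_w(w, b, x, y, halve=True):
    """Derivative of the cost above with respect to w."""
    m = x.shape[0]
    err = w * x + b - y
    # With the 1/2 the factor of 2 cancels; without it the 2 stays.
    factor = 1.0 if halve else 2.0
    return factor * np.sum(err * x) / m

# Toy data (hypothetical): y = 2x exactly, so the best w is 2 with b = 0.
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.0, 4.0, 6.0, 8.0])

# Scan candidate values of w and find the minimiser of each version of the cost.
ws = np.linspace(0.0, 4.0, 401)
best_half = ws[np.argmin([cost(w, 0.0, x, y, halve=True) for w in ws])]
best_full = ws[np.argmin([cost(w, 0.0, x, y, halve=False) for w in ws])]

print("minimiser with 1/(2m):", best_half)
print("minimiser with 1/m:   ", best_full)
print("gradient ratio at w=1:", grad_w(1.0, 0.0, x, y, False) / grad_w(1.0, 0.0, x, y, True))
```

Both versions pick the same w, and the two gradients differ only by a constant factor of 2.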

Here’s a link that will add more insights.
