Hello, I must have missed when this was explained, and navigating back to find where would take a long time.
I know 1/(2m) was used in the cost function for linear regression just to simplify the derivative, but 1/m is now used for logistic regression, and I can't remember why.
Has the 2 been cancelled out somewhere?
Hi, you need to provide the exact place. For instance, the video lesson with timestamp, if possible, so the mentors can take a look and provide an explanation.
The point is that the cost function is different for logistic regression, because it is doing a binary (yes/no) classification. So you can’t use the MSE (mean squared error) loss function. Instead we use the “cross entropy” or “log loss” cost function. Taking the derivative of that function does not produce a factor of 2, because it does not involve squaring anything.
In linear regression, the 2 in the denominator of the cost function cancels out with the 2 we get after differentiating the squared error term while performing optimization using gradient descent.
But there is no squared term in the cost function for logistic regression, which means no 2 appears in the numerator when we take the derivative, so we don't need to put a 2 in the denominator.
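To see the cancellation concretely, here is a small sketch (all data, the single weight w, and the 1-D hypothesis w*x are made up for illustration) that checks each analytic gradient against a numerical finite-difference estimate. Note that the linear-regression gradient ends up with a plain 1/m factor because the 2 from differentiating the square cancels the 2 in 1/(2m), while the logistic gradient never produces a 2 in the first place:

```python
import math

# Hypothetical 1-D toy data, just for illustration.
x = [1.0, 2.0, 3.0, 4.0]
y = [0.0, 0.0, 1.0, 1.0]
m = len(x)
w = 0.3  # arbitrary weight value at which to check the gradients

def mse_cost(w):
    # Linear-regression cost with the 1/(2m) convention.
    return sum((w * xi - yi) ** 2 for xi, yi in zip(x, y)) / (2 * m)

def mse_grad(w):
    # The 2 from differentiating the square cancels the 2 in 1/(2m),
    # leaving a plain 1/m factor.
    return sum((w * xi - yi) * xi for xi, yi in zip(x, y)) / m

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def logloss_cost(w):
    # Logistic-regression (cross-entropy / log loss) cost with 1/m.
    return -sum(
        yi * math.log(sigmoid(w * xi)) + (1 - yi) * math.log(1 - sigmoid(w * xi))
        for xi, yi in zip(x, y)
    ) / m

def logloss_grad(w):
    # Nothing is squared here, so no factor of 2 ever appears.
    return sum((sigmoid(w * xi) - yi) * xi for xi, yi in zip(x, y)) / m

def numeric_grad(f, w, eps=1e-6):
    # Central finite difference as an independent check.
    return (f(w + eps) - f(w - eps)) / (2 * eps)

# Both analytic gradients match their numerical estimates,
# confirming the 1/m factor is correct in each case.
print(abs(mse_grad(w) - numeric_grad(mse_cost, w)))
print(abs(logloss_grad(w) - numeric_grad(logloss_cost, w)))
```

Both printed differences come out tiny, which confirms that the 1/(2m) convention in linear regression and the 1/m convention in logistic regression each yield the same clean (1/m)-scaled gradient.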
Thanks. This really explains it
Thanks a lot. I get it now