In linear regression, I understand that we divide the sum of the losses by 2m instead of m to make the derivation more convenient. In the cost function for logistic regression, however, we divide the sum only by m, not 2m. Why is this? Wouldn't using 2m also make the later calculations simpler, as it does in linear regression? Thanks

In linear regression, the cost function is MSE, which stands for “Mean Squared Error”. When you take the derivative of the squared error, you get a factor of 2, and the \frac {1}{2} in the cost is there to cancel it.
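
To make the cancellation concrete, here is a short sketch of the derivative. The notation (h_\theta for the hypothesis, m training examples, x_j^{(i)} for feature j of example i) is assumed from the usual course convention and is not spelled out in this thread.

```latex
J(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2

\frac{\partial J}{\partial \theta_j}
  = \frac{1}{2m} \sum_{i=1}^{m} 2 \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)}
  = \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)}
```

The 2 produced by the power rule cancels the 2 in the denominator, which is the only reason the 2m is there.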

The loss function for Logistic Regression does not involve squaring anything: it involves logarithms, right? So a factor of \frac {1}{2} would not give you any simplification: it would just be extra baggage.
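
For comparison, here is the same kind of sketch for the logistic cost, assuming the standard cross-entropy form with a sigmoid hypothesis h_\theta(x) = \sigma(\theta^T x) (again, notation assumed rather than taken from this thread).

```latex
J(\theta) = -\frac{1}{m} \sum_{i=1}^{m} \left[ y^{(i)} \log h_\theta(x^{(i)}) + \left( 1 - y^{(i)} \right) \log\left( 1 - h_\theta(x^{(i)}) \right) \right]

\frac{\partial J}{\partial \theta_j}
  = \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)}
```

No factor of 2 appears when differentiating the logarithms, so there is nothing for a \frac {1}{2} to cancel.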

Note that multiplying the loss by a positive constant does nothing in terms of what solution you end up getting: the same input point that minimizes J will also minimize 2J, right? As you say, it’s just a matter of mathematical convenience by making the formulas simpler.
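
If you want to check this numerically, here is a minimal sketch. It assumes a one-parameter logistic model with no bias term and uses a made-up toy dataset; the function name `logistic_cost` and the grid of candidate values are illustrative choices, not anything from the course code.

```python
import math


def logistic_cost(theta, xs, ys):
    """Mean cross-entropy cost J(theta) for a one-parameter logistic model."""
    total = 0.0
    for x, y in zip(xs, ys):
        h = 1.0 / (1.0 + math.exp(-theta * x))  # sigmoid hypothesis
        total += y * math.log(h) + (1 - y) * math.log(1 - h)
    return -total / len(xs)


# Hypothetical toy data, invented for illustration (not linearly separable).
xs = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]
ys = [0.0, 0.0, 1.0, 0.0, 1.0, 1.0]

# Candidate parameter values on a grid from -5 to 5.
thetas = [-5.0 + 0.01 * k for k in range(1001)]


def best_theta(scale):
    """Grid point that minimizes scale * J(theta)."""
    return min(thetas, key=lambda t: scale * logistic_cost(t, xs, ys))


best_J = best_theta(1.0)       # minimizer of J (sum divided by m)
best_half_J = best_theta(0.5)  # minimizer of J/2 (sum divided by 2m)
best_2J = best_theta(2.0)      # minimizer of 2J

print(best_J, best_half_J, best_2J)
```

All three printed values should be identical: rescaling the cost by a positive constant changes its height but not the location of its minimum.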
