Hello @rkranjan,
I will let you decide whether you want to do the research. You might look for some loss functions, take their derivatives, and see what they end up as. Making a table to summarize them would be wonderful. Your call.
However, we can rewrite the cost gradients for linear regression and logistic regression into this same form:

\frac{\partial{J}}{\partial{w_j}} = \frac{1}{m}\sum_{i=1}^{m}(\hat{y}^{(i)} - y^{(i)})x_j^{(i)}

which clearly shows us that the gradients are proportional to the error. That makes a lot of sense, because, in other words, if the error is zero, the gradients are zero. This property aligns with our intuition, doesn’t it?
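To make that concrete, here is a minimal sketch of my own (the numbers are made up, and this is just an illustration, not anything from the course): for a single training example, both cost gradients with respect to the weights reduce to (prediction − label) × input, so zero error means zero gradient.

```python
# My own illustration with made-up numbers: both gradients w.r.t. the
# weights come out as (prediction - label) * input.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([0.5, -1.2, 2.0])   # one feature vector (arbitrary)
w = np.array([0.1, 0.4, -0.3])   # weights (arbitrary)
y = 1.0                          # label

z = w @ x                        # linear regression's prediction
p = sigmoid(z)                   # logistic regression's prediction

grad_linear = (z - y) * x        # from L = 0.5 * (z - y)^2
grad_logistic = (p - y) * x      # from the log loss, via the chain rule

print(grad_linear)
print(grad_logistic)
```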
It is certainly interesting that they share the same look! However, their respective loss functions also give us a glimpse of why:
Linear regression, where z is the model prediction:

\frac{\partial{L}}{\partial{z}} = z - y
Logistic regression, where p is the model prediction:

\frac{\partial{L}}{\partial{p}} = \frac{p - y}{p(1-p)}
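If it helps, the two derivatives can be sanity-checked numerically. This is just my own sketch (the values of y, z, and p are made up), using central finite differences:

```python
# Sketch only (y, z, p are made-up values): check the per-example loss
# derivatives with central finite differences.
#   Linear:   L = 0.5 * (z - y)^2             ->  dL/dz = z - y
#   Logistic: L = -[y ln p + (1-y) ln(1-p)]   ->  dL/dp = (p - y) / (p(1-p))
import numpy as np

y = 1.0
z = 0.7          # an arbitrary linear prediction
p = 0.7          # an arbitrary probability in (0, 1)
h = 1e-6

lin_loss = lambda z_: 0.5 * (z_ - y) ** 2
log_loss = lambda p_: -(y * np.log(p_) + (1 - y) * np.log(1 - p_))

dlin = (lin_loss(z + h) - lin_loss(z - h)) / (2 * h)
dlog = (log_loss(p + h) - log_loss(p - h)) / (2 * h)

assert abs(dlin - (z - y)) < 1e-8
assert abs(dlog - (p - y) / (p * (1 - p))) < 1e-6
```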
Even though both of them have the error term, they don’t actually look similar, do they? Unless we engineer a function g such that p = g(z), because then, by the chain rule:
Logistic regression, where p = g(z):

\frac{\partial{L}}{\partial{z}} = (p - y)\left[\frac{1}{p(1-p)}\frac{\partial{p}}{\partial{z}}\right]
While we have the freedom to engineer any g, what is better than a g that makes the bracketed term equal to 1? Because:
- we get rid of the denominator
- this implies \frac{\partial{p}}{\partial{z}} = p(1-p), which again has a nice property: as p approaches 1 or 0, this gradient approaches 0
- it gives us the look of linear regression’s: \frac{\partial{L}}{\partial{z}} = p - y
It turns out that if we solve the equation \frac{\partial{p}}{\partial{z}} = p(1-p) by integration (separating variables, \int\frac{dp}{p(1-p)} = \int dz gives \log\frac{p}{1-p} = z + C, so p = \frac{1}{1+e^{-(z+C)}}), we find that g is just our very familiar sigmoid function. It is only because we choose the sigmoid function as our g that the loss gradient of logistic regression looks so similar to that of linear regression. There is no law of nature that prohibits anyone from choosing another g, but if they do, their loss gradient will no longer look like linear regression’s.
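As a quick sanity check (my own sketch, not part of the derivation above), we can confirm numerically that the sigmoid does satisfy \frac{\partial{p}}{\partial{z}} = p(1-p), which is exactly what makes the bracketed term equal to 1:

```python
# My sketch: the sigmoid g(z) = 1 / (1 + e^-z) satisfies dp/dz = p * (1 - p).
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

z = np.linspace(-5.0, 5.0, 101)
p = sigmoid(z)

h = 1e-5
dp_dz = (sigmoid(z + h) - sigmoid(z - h)) / (2 * h)   # central difference

assert np.allclose(dp_dz, p * (1 - p), atol=1e-8)
```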
Lastly, I don’t claim this was how the sigmoid historically came into logistic regression, and I have never read that history either. These are just some logical observations.
If you choose to share it with us here, we can take a look!
Cheers,
Raymond