In all the loss function lectures here, Andrew sir uses a minus sign with the log. Why is the minus sign required, and what would happen if it were not used?

Hello @tbhaxor,

A consequence of removing the negative sign is that the loss curve flips upside down, so that a perfect prediction no longer corresponds to the minimum loss but to the maximum. Our gradient descent algorithm would therefore also have to be changed into gradient ascent.

A function can be called a loss function when its value is lowest for a perfect prediction. If we don't have the negative sign, it can't be called a loss function.
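To make this concrete, here is a minimal sketch (the probe values like 0.999 are just illustrative) comparing the standard loss with the sign-removed version for a single example with true label y = 1:

```python
import math

def loss_with_sign(f_wb, y):
    # Standard binary cross-entropy loss for one example
    return -(y * math.log(f_wb) + (1 - y) * math.log(1 - f_wb))

def loss_without_sign(f_wb, y):
    # Same expression, but with the leading negative sign removed
    return y * math.log(f_wb) + (1 - y) * math.log(1 - f_wb)

y = 1
for f_wb in [0.1, 0.5, 0.9, 0.999]:
    print(f"f_wb={f_wb:5.3f}  with sign={loss_with_sign(f_wb, y):8.4f}  "
          f"without sign={loss_without_sign(f_wb, y):8.4f}")
```

With the negative sign, the loss shrinks toward 0 as the prediction improves (a minimum at the perfect prediction); without it, the values grow toward 0 from below, so the perfect prediction sits at a maximum instead.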

Raymond

It’s because f_wb is between 0 and 1.

The log of values between 0 and 1 is always negative.

Cost must always be positive (by convention), so we negate those negative log values to get a positive cost.

So I tried to explore this mathematically a bit.

In the loss function, if we remove the negative sign, then (for y = 1) the term becomes \ln\left(\frac{1}{1+e^{-z}}\right) = -\ln(1+e^{-z}), so to cancel out this negative sign we add the negative sign in front.

Is this one way of thinking?
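The algebraic identity in that step can at least be checked numerically; a small sketch (the test points for z are arbitrary):

```python
import math

def sigmoid(z):
    # Logistic function: f_wb = 1 / (1 + e^{-z})
    return 1.0 / (1.0 + math.exp(-z))

for z in [-2.0, 0.0, 3.5]:
    lhs = math.log(sigmoid(z))            # ln(1 / (1 + e^{-z}))
    rhs = -math.log(1.0 + math.exp(-z))   # -ln(1 + e^{-z})
    print(f"z={z:5.1f}  lhs={lhs:.6f}  rhs={rhs:.6f}")  # lhs == rhs
```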

I think the missing piece in your argument is this: why do we want to cancel out that negative sign? We can't just cancel it for the sake of canceling it. Without this piece, I can't agree with your way of thinking.

Cheers,

Raymond