In all the loss function lectures here, Andrew sir uses a minus sign with the log. Why is the minus sign required, and what would happen if it were not used?

Hello @tbhaxor,

A consequence of removing the negative sign is that the loss curve flips upside down, so that a perfect prediction no longer corresponds to the minimum loss but to the maximum. Our gradient descent algorithm would therefore also have to be changed into gradient ascent.

A function can be called a loss function when its value is lowest for a perfect prediction. If we don't have the negative sign, it can't be called a loss function.
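To make this concrete, here is a minimal sketch (the probe values like 0.999 are just illustrative) comparing the standard loss with the sign-removed version for a single example with true label y = 1:

```python
import math

def loss_with_sign(f_wb, y):
    # Standard binary cross-entropy loss for one example
    return -(y * math.log(f_wb) + (1 - y) * math.log(1 - f_wb))

def loss_without_sign(f_wb, y):
    # Same expression, but with the leading negative sign removed
    return y * math.log(f_wb) + (1 - y) * math.log(1 - f_wb)

y = 1
for f_wb in [0.1, 0.5, 0.9, 0.999]:
    print(f"f_wb={f_wb:5.3f}  with sign={loss_with_sign(f_wb, y):8.4f}  "
          f"without sign={loss_without_sign(f_wb, y):8.4f}")
```

With the negative sign, the loss shrinks toward 0 as the prediction improves (a minimum at the perfect prediction); without it, the values grow toward 0 from below, so the perfect prediction sits at a maximum instead.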

Raymond

It’s because f_wb is between 0 and 1.

The log of values between 0 and 1 is always negative.

Cost must always be positive (by convention), so we negate those negative log values to get a positive cost.

So I tried to explore this mathematically a bit.

In the loss function, if we remove the negative sign, then (for y = 1) the term becomes \ln\left(\frac{1}{1+e^{-z}}\right) = -\ln(1+e^{-z}), so to cancel out this negative sign we add the negative sign in front.

Is this one way of thinking?
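The algebraic identity in that step can at least be checked numerically; a small sketch (the test points for z are arbitrary):

```python
import math

def sigmoid(z):
    # Logistic function: f_wb = 1 / (1 + e^{-z})
    return 1.0 / (1.0 + math.exp(-z))

for z in [-2.0, 0.0, 3.5]:
    lhs = math.log(sigmoid(z))            # ln(1 / (1 + e^{-z}))
    rhs = -math.log(1.0 + math.exp(-z))   # -ln(1 + e^{-z})
    print(f"z={z:5.1f}  lhs={lhs:.6f}  rhs={rhs:.6f}")  # lhs == rhs
```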

I think the missing piece in your argument is this: why do we want to cancel out that negative sign? We can't just cancel it for the sake of canceling it. Without this piece, I can't agree with your way of thinking.

Cheers,

Raymond