Hey everyone
Classroom Item: Week 3, Cost function for logistic regression
In this video, Andrew said that the logistic regression function f_w,b(x) predicts a value between 0 and 1, and this value is then used to compute the cost with logarithms. For example, if y (the actual output) is 0 and the function outputs 0.5 or 0.7, the cost will be fairly high. But suppose the threshold is chosen so that we predict 0 if f_w,b(x) <= 0.7 and 1 otherwise. In that case I think the cost should be zero, since the algorithm predicted the actual output correctly regardless of whether it output 0.5 or 0.7.
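(If I'm reading the lecture's formula right, the per-example loss is -y*log(f_w,b(x)) - (1-y)*log(1-f_w,b(x)), so with y = 0 that gives -log(1 - 0.5) ≈ 0.69 for a prediction of 0.5 and -log(1 - 0.7) ≈ 1.20 for 0.7, even though both fall on the "0" side of my 0.7 threshold.)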
So why is it necessary to use the function's actual output (0.5 or 0.7) in the cost, rather than the binary decision (0 or 1) given by the threshold and decision boundary? From my understanding, f_w,b(x) should predict either 0 or 1 based on the chosen threshold and decision boundary.
Isn't the final binary output (0 or 1) what's actually used when making predictions, rather than the continuous value of the function?
Hi @Tera_Byte
The reason is that logistic regression aims to predict probabilities rather than hard binary outputs. This lets the model learn how confident it should be in each prediction. If we only used the final binary output in the cost, the model wouldn't learn the degree of error when it's unsure. Training on the probabilities is what pushes them closer to 0 or 1 as the model becomes more confident.
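Here's a minimal sketch of that difference in NumPy (the predictions and the threshold are just made up for illustration; the log-loss formula is the one from the lecture):

```python
import numpy as np

def log_loss(y, f):
    # Per-example logistic (cross-entropy) loss: -y*log(f) - (1-y)*log(1-f)
    return -(y * np.log(f) + (1 - y) * np.log(1 - f))

def zero_one_loss(y, f, threshold=0.5):
    # Hard loss on the thresholded prediction: 0 if it matches the label, else 1
    return float((f >= threshold) != y)

y = 0  # actual label
for f in [0.1, 0.4, 0.6, 0.9]:
    print(f"f = {f}: 0/1 loss = {zero_one_loss(y, f)}, log loss = {log_loss(y, f):.3f}")

# The 0/1 loss jumps between 0 and 1 and is flat everywhere else, so gradient
# descent has nothing to work with; the log loss grows smoothly as the
# predicted probability drifts away from the true label.
```

So even two predictions that land on the correct side of the threshold, like 0.1 and 0.4, are not treated the same: the more confident one gets the lower cost.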
Hope it helps! Feel free to ask if you need further assistance.
@Alireza_Saei thanks for answering. Your reply helped, but I feel like I'm still missing something. What do you mean by "the degree of error when it's unsure"? And do you maybe have an example in mind that could expand on your answer?
You’re very welcome @Tera_Byte !
I mean that logistic regression not only tries to predict the correct class but also considers how confident it is about that prediction.
For example, if the actual output is 0 and the model predicts 0.9 (strongly wrong) versus 0.4 (somewhat wrong), the cost function penalizes the 0.9 prediction more heavily because it’s more confident in an incorrect answer. This helps the model learn to be more precise, trying to push values closer to 0 or 1 based on its certainty.
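To put numbers on that (a quick sketch using the lecture's loss formula; 0.9 and 0.4 are just the illustrative predictions from my example):

```python
import numpy as np

y = 0  # actual output
for f in [0.4, 0.9]:
    cost = -(y * np.log(f) + (1 - y) * np.log(1 - f))  # per-example logistic loss
    print(f, round(cost, 2))
# 0.4 -> 0.51 (somewhat wrong), 0.9 -> 2.3 (confidently wrong, much larger cost)
```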
Let me know if this clarifies things a bit more!
Yeah, that answer did it for me. Thanks again @Alireza_Saei
You’re welcome! Happy to help.