Logistic Regression Cost Function Intuition (starts around 3:24)

Hello and thank you for your help.

I am trying to understand the intuition for the logistic loss function. Here is my idea based on the class.

If y is 1, then ŷ should be close to 1 so that the loss L(ŷ, y) is close to 0.
If y is 0, then ŷ should be close to 0 so that the loss L(ŷ, y) is close to 0.

Could you explain the intuition differently? I have tried reading some of the other forum discussions and thinking about it, but nothing has helped.

That sounds correct, but the point is that this is just the first step. The next question is, “OK, if that is your goal, then what is a function that can express it?” How does the log loss function help with that? Prof Ng goes on to explain this in the lecture.

Here’s a thread that shows the graph of the log function between 0 and 1 and discusses this a bit more.
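To make that concrete, here is a minimal sketch (my own, not from the lecture or that thread) that evaluates the log loss L(ŷ, y) = −(y·log(ŷ) + (1 − y)·log(1 − ŷ)) at a few predicted probabilities, so you can see how the loss blows up as ŷ moves away from the true label:

```python
import numpy as np

def log_loss(y_hat, y):
    """Logistic (log) loss for a single prediction:
    L(y_hat, y) = -(y * log(y_hat) + (1 - y) * log(1 - y_hat))"""
    return -(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat))

# Predicted probabilities ranging from "confidently wrong" to "confidently right"
for y_hat in [0.01, 0.1, 0.5, 0.9, 0.99]:
    print(f"y=1, y_hat={y_hat:.2f} -> loss={log_loss(y_hat, 1):.3f}")
    print(f"y=0, y_hat={y_hat:.2f} -> loss={log_loss(y_hat, 0):.3f}")
```

When y = 1, only the −log(ŷ) term is active, so pushing ŷ toward 1 drives the loss toward 0; when y = 0, only −log(1 − ŷ) is active, so pushing ŷ toward 0 does the same. A confident wrong prediction (e.g. y = 1, ŷ = 0.01) is punished with a very large loss.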

@cstockman as to intuition, I personally find it clearer to understand this loss function by its alternate name, cross-entropy loss, where entropy can be seen as a measure of ‘information’ in a system.

From Claude E. Shannon and Warren Weaver’s classic ‘The Mathematical Theory of Communication’, p. 12:

"Now let us return to the idea of information. When we have an information source which is producing a message by which successively selecting discrete symbols (letters, words, musical notes, spots of a certain size, etc.), the probability of choice of the various symbols at one stage of the process being dependent on the previous choices (i.e., a Markoff process), what about the information associated with this procedure?

The quantity which uniquely meets the natural requirements that one sets up for ‘information’ turns out to be exactly that which is known in thermodynamics as entropy. It is expressed in terms of the various probabilities of forming messages, and the probabilities that, when in those stages, certain symbols be chosen next. The formula, moreover, involves the logarithm of probabilities, so that it is a natural generalization of the logarithmic measure spoken of above in connection with simple cases."
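To make that “logarithm of probabilities” concrete, here is a small sketch (mine, not from the book) that computes the Shannon entropy H(p) = −Σ p·log₂(p) of a few distributions: a certain outcome carries no information (entropy 0), while a 50/50 coin flip carries the most (1 bit).

```python
import numpy as np

def entropy(p):
    """Shannon entropy in bits: H(p) = -sum(p * log2(p)),
    treating zero-probability terms as contributing 0."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

print(entropy([1.0, 0.0]))   # 0.0 bits   -- outcome is certain, no information
print(entropy([0.9, 0.1]))   # ~0.469 bits
print(entropy([0.5, 0.5]))   # 1.0 bit    -- maximally uncertain
```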

Cross-entropy takes this a bit further in that it compares two probability distributions: here, the predicted distribution (ŷ, 1 − ŷ) against the true label distribution for the binary classifier, Y = 0 or Y = 1. Low loss (and thus low entropy) suggests a strong ‘signal’ that directs us to the right choice given the inputs provided, which is why, through repeated forward/back prop, we try to get the network to minimize this loss and find the combination of weights that gives us the strongest ‘signal’ one way or the other.

(P.s. Shannon defines entropy, but not cross-entropy. Still a good read).
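If it helps, here is a small sketch (my own wording, not from the course) showing that the binary cross-entropy between the predicted distribution (ŷ, 1 − ŷ) and the one-hot label distribution is exactly the log loss above: a confident correct prediction gives a low loss, while a confident wrong one gives a large loss that pushes the weights the other way.

```python
import numpy as np

def cross_entropy(p_true, p_pred):
    """Cross-entropy H(p_true, p_pred) = -sum(p_true * log(p_pred))."""
    p_true = np.asarray(p_true, dtype=float)
    p_pred = np.asarray(p_pred, dtype=float)
    return -np.sum(p_true * np.log(p_pred))

# True label y = 1  ->  one-hot distribution over (y=1, y=0) is (1, 0)
y_true = [1.0, 0.0]

confident_right = [0.95, 0.05]   # y_hat = 0.95
confident_wrong = [0.05, 0.95]   # y_hat = 0.05

print(cross_entropy(y_true, confident_right))  # ~0.051 -- low loss
print(cross_entropy(y_true, confident_wrong))  # ~3.0   -- high loss
```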

Thank you both…

I will be looking these over again to gain more intuition.