Hi @Ali_Ghadimi. There appears to be a dropped minus sign on that first slide that you show. On the slide, Prof Ng is showing that the cross-entropy loss can be derived from the maximum likelihood principle: given the data (assumed to be drawn from the “correct” distribution), what parameters are most likely to explain/predict the data? In other words, which parameters maximize the (log) likelihood function?
The underlying distribution in the (log) likelihood function (at the top) is the Bernoulli (Binomial) distribution – the basic distribution for a weighted coin toss. In the AI disciplines, the problem is typically couched in terms of a loss function: which parameters are most likely to minimize the loss associated with deviations from the actual data? Hence, the objective function is multiplied by -1 to turn it into a minimization problem.
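For concreteness, here is a sketch of the chain of steps being shown on the slide, written out explicitly. I’m using the course’s \hat{y}^{(i)}, y^{(i)} notation and including the usual \frac{1}{m} averaging in the cost J; the key move is just the sign flip at the end:

$$
\begin{aligned}
P(y \mid x) &= \hat{y}^{\,y}\,(1-\hat{y})^{1-y} &&\text{(Bernoulli)}\\
\log \prod_{i=1}^{m} P\big(y^{(i)} \mid x^{(i)}\big) &= \sum_{i=1}^{m}\Big[\, y^{(i)}\log \hat{y}^{(i)} + \big(1-y^{(i)}\big)\log\big(1-\hat{y}^{(i)}\big)\Big] &&\text{(log likelihood, to maximize)}\\
J &= -\frac{1}{m}\sum_{i=1}^{m}\Big[\, y^{(i)}\log \hat{y}^{(i)} + \big(1-y^{(i)}\big)\log\big(1-\hat{y}^{(i)}\big)\Big] &&\text{(cost, to minimize)}
\end{aligned}
$$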
Postscript: To my mind the script-L function \mathcal{L} is suggestive of “loss” and so should include the minus sign. Prof Ng goes another way, but drops the minus sign in the last line (assuming that this snapshot was not taken a second before the minus sign appears), as if he too fell prey to the ambiguity. Your confusion is quite understandable. Paraphrasing Laplace (I think), “half the battle of mathematics is the invention of a good notation.”
Right! The underlying point here is that the loss involves the logarithm of numbers between 0 and 1. Those logarithms are negative, so we multiply by -1 to get a positive cost value.
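Here is a quick numerical illustration of that point (just a NumPy sketch, not code from the course):

```python
import numpy as np

# Predicted probabilities for a few examples whose true label is y = 1.
y_hat = np.array([0.9, 0.5, 0.1])

# The log of a number in (0, 1) is negative ...
print(np.log(y_hat))   # [-0.105 -0.693 -2.303]

# ... so we flip the sign to get a positive loss that grows
# as the prediction moves away from the true label.
print(-np.log(y_hat))  # [ 0.105  0.693  2.303]
```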
Yes, the slides are a bit confusing. But you just have to keep in mind what I said in my previous reply: the logarithms are all negative and we need the cost to be positive. So it’s just a question of where you put the minus sign: on the individual terms, inside the parens inside the sum, outside the summation (factored out), or incorporated into the definition of L(\hat{y}^{(i)}, y^{(i)}).
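To make the “where does the minus sign live” point concrete, here is a small sketch (again NumPy, not course code, with made-up labels and predictions) showing that two of those placements compute exactly the same cost:

```python
import numpy as np

y     = np.array([1, 0, 1, 1])          # true labels
y_hat = np.array([0.9, 0.2, 0.7, 0.6])  # predicted probabilities
m = y.size

# Minus sign baked into the per-example loss L(y_hat, y).
L = -(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat))
cost_1 = np.sum(L) / m

# Same thing with the minus sign factored outside the summation.
cost_2 = -np.sum(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat)) / m

print(np.isclose(cost_1, cost_2))  # True -- it's just algebra, same number
```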