Week 2 Logistic Regression Gradient Descent video: derivative mistake

In the video "Logistic Regression Gradient Descent" there is a mistake in the derivative of the cost function ("da"). Since the loss function uses the log function (not the ln function), the derivative must be divided by ln(10): "da" = -( y/a - (1-y)/(1-a) )/ln(10)

The formulas shown are correct. What you need to realize is that the notational conventions are different in the ML/DL world than they are in the math world. Whenever they say "log" here, they mean natural logs, not logs base 10. I'm not sure where this convention came from, but one conjecture is that it's based on the way things work in MATLAB, which was pretty commonly used in the early days of ML. Of course Python has subsequently taken over the world. Note that np.log is also the natural log. In fact, Python uses the same naming as MATLAB: if you want logs base 10, the function is np.log10.
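
A quick sanity check you can run yourself (just a sketch, assuming NumPy is imported as np):

```python
import numpy as np

# np.log is the natural logarithm (base e), same naming convention as MATLAB's log
print(np.log(np.e))       # ≈ 1.0
print(np.log(np.exp(3)))  # ≈ 3.0

# Base-10 and base-2 logs get their own explicitly named functions
print(np.log10(1000.0))   # ≈ 3.0
print(np.log2(8.0))       # 3.0
```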

Of course the point you raise also illustrates why you'd be nuts to use logarithms to any base other than e if you're going to be taking derivatives or integrating. You just get inundated with bogus constant factors, and since the shapes of the curves are the same in any case, you'd only get pain from using base 10 and no actual advantage.
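
To make those constant factors concrete, here is the difference written out (standard calculus, with the loss as given in the lectures and "log" read as the natural log):

```latex
% Derivatives of logs in different bases:
\[
  \frac{d}{da}\ln(a) = \frac{1}{a}
  \qquad\text{vs.}\qquad
  \frac{d}{da}\log_{10}(a) = \frac{1}{a\,\ln 10}
\]

% With the loss L(a,y) = -( y log(a) + (1-y) log(1-a) ) and log meaning ln:
\[
  \frac{\partial L}{\partial a} = -\frac{y}{a} + \frac{1-y}{1-a}
\]

% If log meant base 10 instead, every term would carry the extra 1/ln(10)
% factor from the original question: same curve shape, messier constants.
```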


Thanks for the good answer. It would be nice to see this mentioned during the course.

I withdraw my point; there is a link in the course to the full derivation where this is noted.

It seems interesting that people continue to write it this way. "ln()" is fewer characters to write/type than "log()". There must be a reason why people don't just start using the more mathematically correct notation. Do you have any insight into why that is?

No, sorry, I don’t know the history of why the notation is different in the ML world than in the math world. You could argue that the math world is the one that’s backwards. Who really cares about logs base 10, once you start doing calculus? They make no sense: you get no behavioral advantage and it just makes a hideous mess with bogus constant factors at every turn.

One possible theory is that in MATLAB, which was the most common language used for ML work in the early days, the function names are log for natural log, log10 for base-10 logs, and log2 for base-2 logs. And note that MATLAB was created by real mathematicians for doing programming that includes serious mathematics.