Thoughts on the Learning Rate Derivative in Gradient Descent for Logistic Regression


Professor Ng established that the cost function can be written as the average of the losses computed for each training example. He expressed the cost function in terms of the loss function, and defined loss functions for both logistic regression and linear regression. I thought that to compute the gradient for gradient descent in logistic regression, we would take the derivative of the loss function, that is, the derivative of equation (2) in the attached diagram. Or would that derivative have the same effect as differentiating the cost function described?
Could someone help clarify this?

Both methods should give the same result.
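One way to convince yourself of this numerically (a quick sketch with NumPy; the data `X`, `y` and parameters `w`, `b` are made-up illustrative values, not from the course): average the per-example loss derivatives, and compare against a finite-difference gradient of the averaged cost.

```python
import numpy as np

# Tiny made-up dataset (illustrative values only)
X = np.array([[0.5, 1.2], [-1.0, 0.3], [0.8, -0.7], [1.5, 2.0]])
y = np.array([1.0, 0.0, 0.0, 1.0])
w = np.array([0.2, -0.4])
b = 0.1

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost(w, b):
    # Cost = average of the per-example logistic losses
    p = sigmoid(X @ w + b)
    losses = -(y * np.log(p) + (1 - y) * np.log(1 - p))
    return losses.mean()

# Method A: average the per-example loss derivatives
# (for logistic loss, d(loss)/dw_j = (p - y) * x_j per example)
p = sigmoid(X @ w + b)
grad_w = ((p - y)[:, None] * X).mean(axis=0)
grad_b = (p - y).mean()

# Method B: finite-difference gradient of the averaged cost
eps = 1e-6
fd_w = np.array([
    (cost(w + eps * np.eye(2)[j], b) - cost(w - eps * np.eye(2)[j], b)) / (2 * eps)
    for j in range(2)
])
fd_b = (cost(w, b + eps) - cost(w, b - eps)) / (2 * eps)

print(np.allclose(grad_w, fd_w, atol=1e-6))  # True
print(abs(grad_b - fd_b) < 1e-6)             # True
```

Both routes land on the same gradient, because the cost is just the average of the losses.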

Hello @tobiademola

I am not following, because it's hard to see how the effect of the derivative and the cost could be "the same". The derivative drives gradient descent; the cost measures the error. They have different uses.

However, if the question is whether differentiating the loss and averaging gives the same gradient as differentiating the cost, then I understand it, and the answer is YES.

I am not sure the following is relevant to your question, but I feel like writing it down:

Loss is for one sample. Cost is for all samples: it is the average of all the losses.
The derivative of the loss is for one sample. The derivative of the cost is for all samples: it is the average of all the loss derivatives.
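The two points above can be written out in symbols (using the course's notation, where $f_{\vec{w},b}(\vec{x}^{(i)})$ is the model's prediction for example $i$ and $m$ is the number of examples):

$$J(\vec{w}, b) = \frac{1}{m}\sum_{i=1}^{m} L\big(f_{\vec{w},b}(\vec{x}^{(i)}),\, y^{(i)}\big)$$

$$\frac{\partial J}{\partial w_j} = \frac{1}{m}\sum_{i=1}^{m} \frac{\partial}{\partial w_j} L\big(f_{\vec{w},b}(\vec{x}^{(i)}),\, y^{(i)}\big)$$

The second line follows from the first simply by differentiating both sides, since the derivative of an average is the average of the derivatives.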

Cheers,
Raymond


Right! The derivative of a sum is the sum of the derivatives, and the derivative of an average is the average of the derivatives, because differentiation is a linear operation.
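This linearity is easy to check symbolically (a sketch with SymPy; the two "losses" `L1` and `L2` are arbitrary illustrative functions of `w`, not the course's loss):

```python
import sympy as sp

w = sp.symbols('w')

# Two arbitrary per-example "losses" as functions of w (illustrative choices)
L1 = (w - 1)**2
L2 = sp.log(1 + sp.exp(w))

J = (L1 + L2) / 2                             # cost = average of the losses
lhs = sp.diff(J, w)                           # derivative of the average
rhs = (sp.diff(L1, w) + sp.diff(L2, w)) / 2   # average of the derivatives

print(sp.simplify(lhs - rhs) == 0)  # True
```

The same identity holds for any number of terms, which is why the gradient of the cost equals the average of the per-example loss gradients.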


Thanks, Paul! It's the average, not the sum. Sometimes I write it that way, which is sloppy. I should correct that in my post.