What's the usage of J(w,b) for logistic regression?

Jinyan_Liu · July 5, 2023, 10:54pm

I understand that J(w,b) needs to be convex, and that’s why logistic regression uses this log-based equation as its cost. And of course, it can be used to monitor the cost along the way. But I don’t see how it contributes to logistic regression’s gradient descent. The gradient descent update is very similar to linear regression’s; only f(x) is different.
Shouldn’t gradient descent for logistic regression somehow come from logistic regression’s J(w,b)?
What other uses does J(w,b) have for logistic regression?

Schwan_Ray · July 6, 2023, 12:20am

I have the same question!
The partial derivative for linear regression was calculated directly from the squared cost function. Why isn’t the cost function used to calculate the partial derivative in logistic regression then?!?

rmwkwok · July 6, 2023, 12:53am

Hello @Jinyan_Liu , @Schwan_Ray ,

We use logistic regression’s cost function to derive logistic regression’s gradients as well. If you are looking for the steps to derive them, check out this post’s “Derivation steps 2: logistic regression”. You will see how they end up looking the same as the gradients for linear regression.

Cheers,
Raymond

TMosh · July 6, 2023, 2:18am

The equations for the gradients were derived from the cost equation.

The cost value itself isn’t used except for monitoring, as you mentioned. The key to making gradient descent work is the code that computes the gradients.

Jinyan_Liu · July 6, 2023, 10:47pm

Thanks!
But in the steps, the loss functions for both linear and logistic regression are just called L. Although I don’t fully understand the steps, to me L appeared and then disappeared. It looks like what is actually inside L does not matter for computing the gradients?

Jinyan_Liu · July 6, 2023, 10:48pm

How is that done?

TMosh · July 6, 2023, 10:59pm

True, you don’t really need the loss value itself during gradient descent. It is handy to monitor that the cost is decreasing, but it isn’t essential.

It’s an application of calculus: starting from the cost equation, you compute the partial derivatives of the cost with respect to the weights and bias. This gives the equations for the gradients.
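One way to convince yourself that the gradient equations really are the partial derivatives of the logistic cost is to compare them against a finite-difference estimate taken directly from J(w,b). The sketch below is only an illustration: the function names and the tiny dataset are made up for this example and are not taken from the course labs.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(x, w, b):
    # f(x) = sigmoid(w . x + b)
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

def cost(X, y, w, b):
    # Logistic cost J(w,b): mean of the per-example log loss
    total = 0.0
    for xi, yi in zip(X, y):
        f = predict(xi, w, b)
        total += -yi * math.log(f) - (1 - yi) * math.log(1 - f)
    return total / len(X)

def analytic_gradients(X, y, w, b):
    # The familiar formulas: average of (f(x) - y) * x_j and of (f(x) - y)
    m = len(X)
    dj_dw = [0.0] * len(w)
    dj_db = 0.0
    for xi, yi in zip(X, y):
        err = predict(xi, w, b) - yi
        for j in range(len(w)):
            dj_dw[j] += err * xi[j] / m
        dj_db += err / m
    return dj_dw, dj_db

def numerical_gradients(X, y, w, b, eps=1e-6):
    # Central differences computed from the cost alone
    dj_dw = []
    for j in range(len(w)):
        w_plus = list(w)
        w_plus[j] += eps
        w_minus = list(w)
        w_minus[j] -= eps
        dj_dw.append((cost(X, y, w_plus, b) - cost(X, y, w_minus, b)) / (2 * eps))
    dj_db = (cost(X, y, w, b + eps) - cost(X, y, w, b - eps)) / (2 * eps)
    return dj_dw, dj_db

# Hypothetical toy data, chosen only for illustration
X = [[0.5, 1.5], [1.0, 1.0], [1.5, 0.5], [3.0, 0.5], [2.0, 2.0], [1.0, 2.5]]
y = [0, 0, 0, 1, 1, 1]
w = [0.3, -0.2]
b = 0.1

dw_a, db_a = analytic_gradients(X, y, w, b)
dw_n, db_n = numerical_gradients(X, y, w, b)
print("analytic :", dw_a, db_a)
print("numerical:", dw_n, db_n)
```

The two results should agree to several decimal places. If you replaced the log-based cost with a different one, the numerical gradients would change, and the analytic formulas would have to be re-derived to match.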

Jinyan_Liu · July 6, 2023, 11:00pm

What I mean is: what’s actually inside the loss function L does not matter at all when you derive the gradients, right?

Jinyan_Liu · July 6, 2023, 11:02pm

Does the loss function equation matter in the process?

TMosh · July 6, 2023, 11:02pm

The cost function (the average of the per-example loss L over the training set) is necessary because that’s where the gradients come from.

Jinyan_Liu · July 6, 2023, 11:05pm

Yes, that is how I understood it. It’s just that the loss and cost functions for linear and logistic regression are so different, yet their gradient descent updates are so similar, which makes it very hard to understand what’s happening in the middle.

TMosh · July 6, 2023, 11:11pm

This is a happy circumstance. It works out this way because of the sigmoid function in the predicted y-hat value for logistic regression, combined with how the logistic cost is defined.
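For anyone who wants to see where the cancellation happens, here is a short sketch of the chain rule for a single training example. The notation is an assumption made for this sketch: f is the prediction, z = w·x + b, and σ is the sigmoid.

```latex
\begin{align*}
L &= -y \log f - (1 - y)\log(1 - f), \qquad f = \sigma(z), \quad z = \mathbf{w}\cdot\mathbf{x} + b \\
\frac{\partial L}{\partial f} &= -\frac{y}{f} + \frac{1 - y}{1 - f} = \frac{f - y}{f(1 - f)} \\
\frac{\partial f}{\partial z} &= \sigma(z)\bigl(1 - \sigma(z)\bigr) = f(1 - f) \\
\frac{\partial L}{\partial w_j} &= \frac{\partial L}{\partial f}\,\frac{\partial f}{\partial z}\,\frac{\partial z}{\partial w_j} = (f - y)\,x_j \\
\frac{\partial J}{\partial w_j} &= \frac{1}{m}\sum_{i=1}^{m}\bigl(f^{(i)} - y^{(i)}\bigr)\,x_j^{(i)}, \qquad
\frac{\partial J}{\partial b} = \frac{1}{m}\sum_{i=1}^{m}\bigl(f^{(i)} - y^{(i)}\bigr)
\end{align*}
```

The factor f(1 - f) from the sigmoid’s derivative cancels the denominator that comes from the log terms. For linear regression, with the squared error cost that has a 1/(2m) factor and f = z, the derivative lands on the same (f - y) x_j form, which is why the two update rules look alike even though f is different.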

Jinyan_Liu · July 6, 2023, 11:19pm

Thanks! I will just think of it as somehow ending up this way. But the gradients do get calculated from the cost or loss functions.

rmwkwok · July 7, 2023, 12:57am

Hi @Jinyan_Liu,

How do you think I got the underlined equation?

I used the loss function for logistic regression.

Raymond

rmwkwok · July 7, 2023, 1:00am

Hello @Jinyan_Liu,

You don’t need to compute the loss to do gradient descent. However, you need to compute a metric (which can just be the loss itself) over the training set and the test set to monitor how it changes over iterations. You will learn about it in Course 2 Week 3.
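To make that concrete, here is a minimal sketch of a gradient descent loop in which the cost is computed only for monitoring. The helper names, the `track_cost` flag, and the toy data are hypothetical and chosen for this illustration. They are not the course’s lab code.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(x, w, b):
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

def cost(X, y, w, b):
    # Logistic cost J(w,b), used here only for monitoring
    total = 0.0
    for xi, yi in zip(X, y):
        f = predict(xi, w, b)
        total += -yi * math.log(f) - (1 - yi) * math.log(1 - f)
    return total / len(X)

def gradients(X, y, w, b):
    # Gradient formulas that were derived from J(w,b) ahead of time
    m = len(X)
    dj_dw = [0.0] * len(w)
    dj_db = 0.0
    for xi, yi in zip(X, y):
        err = predict(xi, w, b) - yi
        for j in range(len(w)):
            dj_dw[j] += err * xi[j] / m
        dj_db += err / m
    return dj_dw, dj_db

def gradient_descent(X, y, w, b, alpha, num_iters, track_cost=True):
    history = []
    for _ in range(num_iters):
        dj_dw, dj_db = gradients(X, y, w, b)
        # The update uses only the gradients, never the cost value
        w = [wj - alpha * g for wj, g in zip(w, dj_dw)]
        b = b - alpha * dj_db
        if track_cost:
            history.append(cost(X, y, w, b))
    return w, b, history

# Hypothetical toy data, chosen only for illustration
X = [[0.5, 1.5], [1.0, 1.0], [1.5, 0.5], [3.0, 0.5], [2.0, 2.0], [1.0, 2.5]]
y = [0, 0, 0, 1, 1, 1]

w1, b1, hist = gradient_descent(X, y, [0.0, 0.0], 0.0, alpha=0.1, num_iters=200)
w2, b2, _ = gradient_descent(X, y, [0.0, 0.0], 0.0, alpha=0.1, num_iters=200, track_cost=False)

print("same parameters with and without cost tracking:", w1 == w2 and b1 == b2)
print("cost after first and last iteration:", hist[0], hist[-1])
```

Turning `track_cost` off does not change the learned parameters, because the cost value never enters the update. The history is still worth keeping, since a cost that is not decreasing is usually the first sign that the learning rate or the gradient code is wrong.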

Cheers,
Raymond

Jinyan_Liu · July 7, 2023, 2:50pm

Ah, thank you! Great that you pointed it out for me! I don’t have this math background, so I couldn’t understand the steps all by myself!

Jinyan_Liu · July 7, 2023, 2:52pm

Thank you so much! Now I understand! And it’s so interesting that the gradient descent updates for linear and logistic regression turn out so similar after the calculation! Thank you!

rmwkwok · July 7, 2023, 7:56pm

No problem @Jinyan_Liu

fulorianarendra5 · June 9, 2024, 3:03pm

Thanks!!
