Gradient Descent on logistic regression

Hi team, I'm having trouble understanding the following concept:

  1. What does the dw_1^{(i)} and (x^{(i)}, y^{(i)}) notation mean here?


" So, with all of these calculations, you’ve just computed the derivatives of the cost function J with respect to each your parameters w_1, w_2 and b. Just a couple of details about what we’re doing, we’re using dw_1 and dw_2 and db as accumulators, so that after this computation, dw_1 is equal to the derivative of your overall cost function with respect to w_1 and similarly for dw_2 and db. So, notice that dw_1 and dw_2 do not have a superscript i, because we’re using them in this code as accumulators to sum over the entire training set."

I am not getting what it means by "we’re using dw_1 and dw_2 and db as accumulators, so that after this computation, dw_1 is equal to the derivative of your overall cost function with respect to w_1 and similarly for dw_2 and db."

Does it mean something like this?

image

That is, dw_1 represents the overall derivative dJ/dw_1 because it's the average over the samples? I've also sketched the loop in code below to check my understanding.
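For reference, here is my rough understanding of the loop from the lecture in code form. This is just my own sketch, assuming two features x_1 and x_2 per example, my own variable names, and numpy for the sigmoid; it's not the course code:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gradients_one_pass(X, Y, w1, w2, b):
    """One pass over the training set, using dw1, dw2, db as accumulators.
    X: (m, 2) array of examples, Y: (m,) array of 0/1 labels (my assumed shapes)."""
    m = X.shape[0]
    J = 0.0
    dw1 = 0.0   # accumulator: will end up as dJ/dw1 (no superscript i, summed over all examples)
    dw2 = 0.0   # accumulator: will end up as dJ/dw2
    db = 0.0    # accumulator: will end up as dJ/db
    for i in range(m):
        x1, x2 = X[i, 0], X[i, 1]
        z = w1 * x1 + w2 * x2 + b
        a = sigmoid(z)
        J += -(Y[i] * np.log(a) + (1 - Y[i]) * np.log(1 - a))
        dz = a - Y[i]       # dL/dz for example i
        dw1 += x1 * dz      # this example's contribution, i.e. dw1^(i)
        dw2 += x2 * dz
        db += dz
    # dividing by m turns the sums into derivatives of the *average* cost J
    return J / m, dw1 / m, dw2 / m, db / m
```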

Can you try the 2nd assignment of Week 2 (Logistic_Regression_with_a_Neural_Network_mindset.ipynb) up to Exercise 6, and reply to this thread if you have further questions?

Yes, J is the average of the L values over all the samples. But note that taking the average is a linear operation, so the derivative of the average is the average of the derivatives, right? Think about it for a second and that should make sense.
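Concretely, writing m for the number of samples:

J = \displaystyle \frac{1}{m} \sum_{i=1}^{m} L(a^{(i)}, y^{(i)}) \quad\Longrightarrow\quad \frac{\partial J}{\partial w_1} = \frac{1}{m} \sum_{i=1}^{m} \frac{\partial L(a^{(i)}, y^{(i)})}{\partial w_1}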

For the first question, I think you’ve drawn the circle in the wrong place. The point is that

dw_1^{(i)} = \displaystyle \frac {\partial L(a^{(i)}, y^{(i)})}{\partial w_1}

That is what he’s showing with the sideways curly bracket there. And you use (x^{(i)}, y^{(i)}) to compute that. The formula for how to do that is on the second slide you show.
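In case it's useful, with the usual logistic regression (cross-entropy) loss from the lectures, that per-sample derivative works out to:

dw_1^{(i)} = \displaystyle \frac {\partial L(a^{(i)}, y^{(i)})}{\partial w_1} = x_1^{(i)} \, \big(a^{(i)} - y^{(i)}\big), \qquad a^{(i)} = \sigma\big(w_1 x_1^{(i)} + w_2 x_2^{(i)} + b\big)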

Also, as Balaji says, this will become several steps more “concrete” when you go through the instructions and the code in the Logistic Regression assignment.
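As a preview, here is roughly what the vectorized version of the same computation looks like. This is only a sketch with my own names and assumed shapes (X of shape (n_features, m), Y of shape (1, m), w of shape (n_features, 1)), not the notebook's exact code:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def propagate_sketch(w, b, X, Y):
    m = X.shape[1]
    A = sigmoid(np.dot(w.T, X) + b)                # activations for all m examples at once
    cost = -np.sum(Y * np.log(A) + (1 - Y) * np.log(1 - A)) / m
    dw = np.dot(X, (A - Y).T) / m                  # dJ/dw: the matrix product sums over examples
    db = np.sum(A - Y) / m                         # dJ/db
    return dw, db, cost
```

The np.dot and np.sum calls do the same summing over the training set that the dw_1, dw_2, db accumulators did in the explicit loop.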