Gradient Descent on logistic regression

Hi team, I'm having trouble understanding the following concept:

  1. What does the dw_1^{(i)} and (x^{(i)}, y^{(i)}) notation mean here?


" So, with all of these calculations, you’ve just computed the derivatives of the cost function J with respect to each your parameters w_1, w_2 and b. Just a couple of details about what we’re doing, we’re using dw_1 and dw_2 and db as accumulators, so that after this computation, dw_1 is equal to the derivative of your overall cost function with respect to w_1 and similarly for dw_2 and db. So, notice that dw_1 and dw_2 do not have a superscript i, because we’re using them in this code as accumulators to sum over the entire training set."

I am not getting what it means by "we’re using dw_1 and dw_2 and db as accumulators, so that after this computation, dw_1 is equal to the derivative of your overall cost function with respect to w_1 and similarly for dw_2 and db."

Does it mean something like this?

image

That is, dw_1 represents the overall derivative dJ/dw_1 because it's the average over the samples? I've also sketched the loop in code below to check my understanding.
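For reference, here is my rough understanding of the loop from the lecture in code form. This is just my own sketch, assuming two features x_1 and x_2 per example, my own variable names, and numpy for the sigmoid; it's not the course code:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gradients_one_pass(X, Y, w1, w2, b):
    """One pass over the training set, using dw1, dw2, db as accumulators.
    X: (m, 2) array of examples, Y: (m,) array of 0/1 labels (my assumed shapes)."""
    m = X.shape[0]
    J = 0.0
    dw1 = 0.0   # accumulator: will end up as dJ/dw1 (no superscript i, summed over all examples)
    dw2 = 0.0   # accumulator: will end up as dJ/dw2
    db = 0.0    # accumulator: will end up as dJ/db
    for i in range(m):
        x1, x2 = X[i, 0], X[i, 1]
        z = w1 * x1 + w2 * x2 + b
        a = sigmoid(z)
        J += -(Y[i] * np.log(a) + (1 - Y[i]) * np.log(1 - a))
        dz = a - Y[i]       # dL/dz for example i
        dw1 += x1 * dz      # this example's contribution, i.e. dw1^(i)
        dw2 += x2 * dz
        db += dz
    # dividing by m turns the sums into derivatives of the *average* cost J
    return J / m, dw1 / m, dw2 / m, db / m
```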

Can you try the 2nd assignment of Week 2 (Logistic_Regression_with_a_Neural_Network_mindset.ipynb) up to Exercise 6, and reply to this thread if you have further questions?

Yes, J is the average of the L values over all the samples. But note that taking the average is a linear operation, so the derivative of the average is the average of the derivatives, right? Think about it for a second and that should make sense.
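Concretely, writing m for the number of samples:

J = \displaystyle \frac{1}{m} \sum_{i=1}^{m} L(a^{(i)}, y^{(i)}) \quad\Longrightarrow\quad \frac{\partial J}{\partial w_1} = \frac{1}{m} \sum_{i=1}^{m} \frac{\partial L(a^{(i)}, y^{(i)})}{\partial w_1}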

For the first question, I think you’ve drawn the circle in the wrong place. The point is that

dw_1^{(i)} = \displaystyle \frac {\partial L(a^{(i)}, y^{(i)})}{\partial w_1}

That is what he’s showing with the sideways curly bracket there. And you use (x^{(i)}, y^{(i)}) to compute that. The formula for how to do that is on the second slide you show.
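In case it's useful, with the usual logistic regression (cross-entropy) loss from the lectures, that per-sample derivative works out to:

dw_1^{(i)} = \displaystyle \frac {\partial L(a^{(i)}, y^{(i)})}{\partial w_1} = x_1^{(i)} \, \big(a^{(i)} - y^{(i)}\big), \qquad a^{(i)} = \sigma\big(w_1 x_1^{(i)} + w_2 x_2^{(i)} + b\big)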

Also, as Balaji says, this will become several steps more “concrete” when you go through the instructions and the code in the Logistic Regression assignment.
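As a preview, here is roughly what the vectorized version of the same computation looks like. This is only a sketch with my own names and assumed shapes (X of shape (n_features, m), Y of shape (1, m), w of shape (n_features, 1)), not the notebook's exact code:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def propagate_sketch(w, b, X, Y):
    m = X.shape[1]
    A = sigmoid(np.dot(w.T, X) + b)                # activations for all m examples at once
    cost = -np.sum(Y * np.log(A) + (1 - Y) * np.log(1 - A)) / m
    dw = np.dot(X, (A - Y).T) / m                  # dJ/dw: the matrix product sums over examples
    db = np.sum(A - Y) / m                         # dJ/db
    return dw, db, cost
```

The np.dot and np.sum calls do the same summing over the training set that the dw_1, dw_2, db accumulators did in the explicit loop.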