Dear Mentor,

I don’t understand why (x^(i), y^(i)) is subtracted from dw^(i)_1 in the equation below, because I think the derivative of L(a^(i), y^(i)) with respect to w_1 is dw^(i)_1.

Hi, I don’t think Andrew Ng is implying that you need to subtract those. He is saying that you need to compute the derivative dw^(i)_1, and he refers to (x^(i), y^(i)) to indicate that it is computed on a single training example.

“So, it turns out that the derivative with respect to w_1 of the overall cost function is also going to be the average of derivatives with respect to w_1 of the individual loss terms. But previously, we have already shown how to compute this term as dw^(i)_1, which we, on the previous slide, showed how to compute on a single training example.”
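
If it helps, here is a minimal sketch of what that quote means, assuming the logistic regression setup from the lecture with two features and the cross-entropy loss, where dw^(i)_1 = x_1^(i) * (a^(i) - y^(i)). The function names, the training examples, and the parameter values are all made up for illustration. It computes dw^(i)_1 for each single example (x^(i), y^(i)), averages them, and compares the result with a numerical derivative of the cost J.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def dw1_single(w1, w2, b, x1, x2, y):
    """dw^(i)_1: derivative of L(a^(i), y^(i)) w.r.t. w_1 on ONE example."""
    a = sigmoid(w1 * x1 + w2 * x2 + b)
    return x1 * (a - y)

def dw1_cost(w1, w2, b, examples):
    """Derivative of the cost J w.r.t. w_1: the average of the dw^(i)_1."""
    total = sum(dw1_single(w1, w2, b, x1, x2, y) for x1, x2, y in examples)
    return total / len(examples)

def cost(w1, w2, b, examples):
    """Cost J: the average cross-entropy loss over all examples."""
    total = 0.0
    for x1, x2, y in examples:
        a = sigmoid(w1 * x1 + w2 * x2 + b)
        total += -(y * math.log(a) + (1 - y) * math.log(1 - a))
    return total / len(examples)

# Hypothetical data: each tuple is (x_1, x_2, y) for one training example.
examples = [(1.0, 2.0, 1), (0.5, -1.0, 0), (-1.5, 0.3, 1)]
w1, w2, b = 0.1, -0.2, 0.05  # arbitrary parameter values

# Central-difference approximation of dJ/dw_1, used only as a sanity check.
eps = 1e-6
numeric = (cost(w1 + eps, w2, b, examples) - cost(w1 - eps, w2, b, examples)) / (2 * eps)
analytic = dw1_cost(w1, w2, b, examples)

print("average of dw^(i)_1:", analytic)
print("numerical dJ/dw_1:  ", numeric)
```

The two printed values should agree closely, and nothing is subtracted from dw^(i)_1 anywhere: (x^(i), y^(i)) only identifies which training example each dw^(i)_1 is computed on.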

Oh, now I understand the meaning of that notation. Thank you for your help.