In the video “Vectorizing Logistic Regression’s Gradient Output”, I didn’t understand the explanation of the vectorization of dw:

I understand why the resulting vector is n x 1, but it seems to be equal to:

[ x_11 * dz_1 + x_12 * dz_2 + … + x_1m * dz_m
  …
  x_n1 * dz_1 + x_n2 * dz_2 + … + x_nm * dz_m ]

which does not look equal to multiplying the vector x_1 by dz_1, x_2 by dz_2, …, x_m by dz_m and summing the results.

Can you please explain?


He’s just writing out, element by element, the effect of that dot product. Remember that dZ has dimension 1 x m, and each of the vectors x^{(i)} is n x 1, with m of them stacked as the columns of X. So each term in that sum is an n x 1 vector multiplied by a scalar dz^{(i)}, and adding the m terms gives an n x 1 vector. Equivalently, X dZ^T is (n x m) dotted with (m x 1), which of course gives an n x 1 result. The two expressions are exactly the same computation, just written in different notations.
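If it helps, here is a quick NumPy sketch (with made-up small dimensions n = 3, m = 4 and random data) that computes dw both ways, the vectorized form X dZ^T / m and the explicit loop over the per-example terms x^{(i)} * dz^{(i)}, and checks that they match:

```python
import numpy as np

# Illustrative sizes: n features, m training examples.
n, m = 3, 4
rng = np.random.default_rng(0)
X = rng.standard_normal((n, m))   # columns are the x^{(i)} vectors (each n x 1)
dZ = rng.standard_normal((1, m))  # row vector of the scalars dz^{(i)}

# Vectorized form from the lecture: dw = (1/m) * X dZ^T  ->  shape (n, 1)
dw_vectorized = (1 / m) * (X @ dZ.T)

# Explicit sum: dw = (1/m) * sum_i  x^{(i)} * dz^{(i)}
dw_loop = np.zeros((n, 1))
for i in range(m):
    dw_loop += X[:, i:i + 1] * dZ[0, i]  # n x 1 vector times a scalar
dw_loop /= m

print(dw_vectorized.shape)                   # (n, 1)
print(np.allclose(dw_vectorized, dw_loop))   # the two agree
```

The matrix product is just a compact way of writing that scalar-times-column sum, which is why the lecture can switch between the two views.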