Question about gradient descent for a neural network

I don’t understand this.
Can anyone explain?

Here is my understanding of this gradient, but it doesn’t match the video.

My comments are above. Please fix those points first before we continue.


Thanks for your reply. Here is the new version (* is the element-wise product).

In your dw2, the shape of A1 is (something, number of samples), and the shape of dz2 equals the shape of A2, which is not (number of samples, something). Because the shapes do not match, the two cannot be matrix-multiplied. Could you please double-check?
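
For anyone following along, the shape problem can be reproduced with a quick NumPy check. This is only a sketch: the layer sizes `n1`, `n2`, and `m` are made-up values, the arrays are random placeholders rather than anything from the original post, and it assumes the samples-as-columns layout described above.

```python
import numpy as np

rng = np.random.default_rng(0)
n1, n2, m = 4, 1, 5  # hypothetical layer sizes and number of samples

A1 = rng.standard_normal((n1, m))   # (something, number of samples)
dZ2 = rng.standard_normal((n2, m))  # same shape as A2

# A1 @ dZ2 needs dZ2 to have m rows, but it has n2 rows, so NumPy rejects it.
try:
    A1 @ dZ2
    product_ok = True
except ValueError:
    product_ok = False

# Transposing A1 makes the inner dimensions agree: (n2, m) @ (m, n1).
dW2 = dZ2 @ A1.T

print(product_ok)  # False
print(dW2.shape)   # (1, 4)
```

With the transpose, the result has shape (n2, n1), which is the shape a weight matrix W2 must have to map A1 to a Z2 of shape (n2, m).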

In your db2, dz2/db2 does not equal the scalar 1. What should it be?
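
One way to explore this question is a numerical gradient check. The sketch below assumes Z2 = W2 · A1 + b2, with b2 of shape (n2, 1) broadcast across the m sample columns. It uses a made-up linear loss, `sum(G * Z2)`, so that `G` plays the role of the upstream gradient dZ2. All names and sizes are illustrative, not taken from the original post, and no 1/m factor is included since `G` is just a stand-in.

```python
import numpy as np

rng = np.random.default_rng(1)
n1, n2, m = 3, 2, 4  # hypothetical sizes

W2 = rng.standard_normal((n2, n1))
A1 = rng.standard_normal((n1, m))
b2 = rng.standard_normal((n2, 1))
G = rng.standard_normal((n2, m))  # stands in for the upstream gradient dZ2


def loss(b):
    # b has shape (n2, 1) and is broadcast across all m columns of Z2.
    Z2 = W2 @ A1 + b
    return np.sum(G * Z2)


# Candidate analytic gradient: each bias entry touches every sample,
# so the contributions are summed over the sample axis.
db2_analytic = G.sum(axis=1, keepdims=True)

# Central finite differences as an independent check.
eps = 1e-6
db2_numeric = np.zeros_like(b2)
for i in range(n2):
    step = np.zeros_like(b2)
    step[i, 0] = eps
    db2_numeric[i, 0] = (loss(b2 + step) - loss(b2 - step)) / (2 * eps)

print(np.allclose(db2_analytic, db2_numeric, atol=1e-5))  # True
```

The check agrees with summing dZ2 over the sample axis, which is the same as multiplying dZ2 by a column of m ones rather than by the scalar 1.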

In your dz1, why is 1/m missing? Your dz2 comes with a 1/m.

Thank you for your help. I have learned more about matrix calculus, and now I understand it!

It’s great, @Xuan_Thanh_Nguyen!