The following question is in reference to Course 1, week 4:

I’ve gone through the lectures several times now, and I haven’t seen a derivation of dZ^[1]. It’s also not covered in the “Backpropagation Intuition (Optional)” video. The equation is presented, but its derivation is never discussed. It combines a matrix product with an element-wise multiplication, and that form breaks the symmetry of the other gradient equations.

The equation in question is as follows:

dZ^[1] = (W^[2]).T dZ^[2] * g'(Z^[1]), where * denotes element-wise multiplication
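To make sure I at least understand how the equation is applied, here is a small NumPy sketch of the computation with made-up layer sizes (n_x = 3 inputs, 4 hidden units, 1 output, m = 5 examples) and sigmoid assumed for g, so g'(z) = g(z)(1 - g(z)); the shapes and the choice of activation are my assumptions, not from the lectures:

```python
import numpy as np

np.random.seed(0)
n_x, n_1, n_2, m = 3, 4, 1, 5  # hypothetical layer sizes and batch size

W2 = np.random.randn(n_2, n_1)   # W^[2], shape (1, 4)
dZ2 = np.random.randn(n_2, m)    # dZ^[2], shape (1, 5)
Z1 = np.random.randn(n_1, m)     # Z^[1], shape (4, 5)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Assuming g is sigmoid: g'(z) = g(z) * (1 - g(z))
g_prime = sigmoid(Z1) * (1 - sigmoid(Z1))   # shape (4, 5)

# Matrix product first, then element-wise multiply:
# (W^[2]).T dZ^[2] has shape (4, 1) @ (1, 5) -> (4, 5), same as Z^[1]
dZ1 = (W2.T @ dZ2) * g_prime

print(dZ1.shape)  # (4, 5)
```

So the shapes work out: the matrix product maps the gradient back to the hidden layer's dimensions, and the element-wise factor matches Z^[1]. What I’m missing is why that particular combination is the correct gradient.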

What does the product (W^[2]).T dZ^[2] represent? I haven’t seen W.T multiplied by dZ at any other point in the course.

Was this derivation intentionally skipped because of its difficulty level? Can you point me to a reference in which this gradient is derived? I’d really like to complete the picture. Thanks in advance.