I understand that we are applying the chain rule to find dz[2], which I think is precisely dz[2]/da[1] * da[1]/dz[1]. I am not sure I understand how we get w[2].T*dz[2] for dz[2]/da[1]. Please correct me if I’m wrong.

This course is specifically designed not to require any knowledge of calculus, so Prof. Ng does not show the derivations of any of the gradients. Here’s a thread with some links to websites that cover these topics.

Here’s another thread which gives some derivations involving calculus, although it does not address the dZ^{[2]} formula.
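
For the specific term asked about in the question, here is a short sketch of where w[2].T*dz[2] comes from. It rests on assumptions not spelled out in this thread: a single training example, the course's shorthand in which dz^{[2]} means \partial L / \partial z^{[2]} (the gradient of the loss, not a derivative of z^{[2]} itself), the linear step z^{[2]} = W^{[2]} a^{[1]} + b^{[2]}, and an elementwise activation a^{[1]} = g^{[1]}(z^{[1]}). Under that reading, the factor in question is \partial L / \partial a^{[1]} rather than dz[2]/da[1], and the quantity the chain rule produces is dz^{[1]}.

```latex
\begin{align*}
z^{[2]}_i &= \sum_j W^{[2]}_{ij}\, a^{[1]}_j + b^{[2]}_i
  \quad\Rightarrow\quad
  \frac{\partial z^{[2]}_i}{\partial a^{[1]}_j} = W^{[2]}_{ij} \\
\frac{\partial L}{\partial a^{[1]}_j}
  &= \sum_i \frac{\partial L}{\partial z^{[2]}_i}\,
            \frac{\partial z^{[2]}_i}{\partial a^{[1]}_j}
   = \sum_i W^{[2]}_{ij}\, dz^{[2]}_i
   = \left( W^{[2]T} dz^{[2]} \right)_j \\
dz^{[1]}
  &= \frac{\partial L}{\partial a^{[1]}} \odot g^{[1]\prime}\!\left(z^{[1]}\right)
   = W^{[2]T} dz^{[2]} \odot g^{[1]\prime}\!\left(z^{[1]}\right)
\end{align*}
```

The transpose appears because the sum runs over i, the row index of W^{[2]}: each hidden activation a^{[1]}_j feeds every unit of layer 2, so its gradient collects one contribution per layer-2 unit. The \odot is an elementwise product, which is what `*` does between NumPy arrays.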