How does W^[2]T appear?

Hi everyone! I’m trying to understand how to make W^[2]T appear when computing the backpropagation step. I get dZ^[2], dW^[2], db^[2] correctly, but when computing dZ^[1] (which should be dZ^[1] = W^[2]T dZ^[2] * g^[1]'(Z^[1])) I can’t get W^[2]T to come out.
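Per sample, the step I’m trying to reproduce should come from the chain rule (if I understand it correctly — this is my notation, lowercase for a single column):

```latex
% Jacobian of z^{[2]} = W^{[2]} a^{[1]} + b^{[2]} with respect to a^{[1]}
% is W^{[2]} itself (shape 1x2 here); the transpose should appear when the
% upstream gradient is pulled back through that Jacobian.
da^{[1]} = \left(\frac{\partial z^{[2]}}{\partial a^{[1]}}\right)^{T} dz^{[2]}
         = W^{[2]T}\, dz^{[2]},
\qquad
dz^{[1]} = W^{[2]T}\, dz^{[2]} \ast g^{[1]\prime}\!\left(z^{[1]}\right)
```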

I’m using as a reference a two-layer ANN with two inputs, two hidden units, and one output unit, forwarding an input matrix X with two samples (all dimensions as described in the lecture “Gradient Descent for Neural Networks” by Andrew Ng in the first course of the Deep Learning Specialization). From my calculations I assumed this transpose should appear when computing the partial derivative of Z^[2] with respect to A^[1], but that is giving me the following matrix:

[ [w_11^[2], w_12^[2]],
  [w_11^[2], w_12^[2]] ]
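For concreteness, here is a minimal NumPy sketch with the dimensions above (2 inputs, 2 hidden units, 1 output unit, 2 samples; the tanh hidden activation and sigmoid output are my assumptions). The shapes only line up if W^[2] is transposed before multiplying:

```python
import numpy as np

# Shapes from the question: n_x = 2 inputs, n_h = 2 hidden units,
# n_y = 1 output unit, m = 2 samples.
rng = np.random.default_rng(0)
X = rng.standard_normal((2, 2))   # (n_x, m)
W1 = rng.standard_normal((2, 2))  # (n_h, n_x)
b1 = np.zeros((2, 1))
W2 = rng.standard_normal((1, 2))  # (n_y, n_h)
b2 = np.zeros((1, 1))

# Forward pass (tanh hidden layer, sigmoid output -- an assumption).
Z1 = W1 @ X + b1                  # (2, 2)
A1 = np.tanh(Z1)
Z2 = W2 @ A1 + b2                 # (1, 2)
A2 = 1.0 / (1.0 + np.exp(-Z2))

Y = np.array([[1.0, 0.0]])        # hypothetical labels, (n_y, m)

# Backward pass: dZ2 is (1, 2), so W2.T @ dZ2 is (2, 1) @ (1, 2) = (2, 2),
# matching Z1. Without the transpose, W2 @ dZ2 would be (1, 2) @ (1, 2)
# and would not even be a valid product.
dZ2 = A2 - Y                      # (1, 2)
dZ1 = (W2.T @ dZ2) * (1 - A1**2)  # (2, 2); (1 - A1**2) is g^[1]' for tanh
```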

Any help on what I’m doing wrong would be super appreciated. Here are my derivations:

Vectorized ForwardBackward Equations_240126_215851.pdf (3.6 MB)

Which course are you attending? You posted in “AI Questions”, but this sounds like a better fit for one of the course Q&A forums.

You can move your question to the correct course forum by using the “pencil” icon on the thread title.