Backward propagation derivation

In addition to Raymond’s excellent points, note that the forward propagation formulas for neural networks are different from those for Logistic Regression. They are:

$$Z^{[l]} = W^{[l]} \cdot A^{[l-1]} + b^{[l]}$$
$$A^{[l]} = g^{[l]}(Z^{[l]})$$

The salient point is that there are no transposes involved. Here’s a thread that goes into a bit more detail on how Prof Ng arrived at that formulation.
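To make the dimensions concrete, here is a minimal NumPy sketch of that forward step. The `forward_layer` helper and the layer sizes are made up purely for illustration; they are not the assignment's code.

```python
import numpy as np

def forward_layer(A_prev, W, b, activation):
    """One forward-propagation step: Z = W . A_prev + b, then A = g(Z)."""
    Z = np.dot(W, A_prev) + b   # note: no transposes in this formulation
    A = activation(Z)
    return A, Z

# Illustrative sizes: 3 input features, 4 hidden units, 5 examples
np.random.seed(1)
A0 = np.random.randn(3, 5)      # A^{[0]} = X, shape (n^{[0]}, m)
W1 = np.random.randn(4, 3)      # W^{[1]}, shape (n^{[1]}, n^{[0]})
b1 = np.zeros((4, 1))           # b^{[1]}, broadcast across the m examples
A1, Z1 = forward_layer(A0, W1, b1, np.tanh)
print(A1.shape)                 # (4, 5): one activation column per example
```

Because each column of `A_prev` is one training example, `W` can multiply it directly without any transpose, and `b` broadcasts across the columns.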

Have you actually watched all the lectures in Week 3 and done the assignment? I would recommend doing the assignment first, before you go back and try to derive all the formulas from first principles. It will help to make sure we are clear on what the formulas actually are before we go to the trouble of deriving them again.
