Gradient Descent Backpropagation Calculation

I may be answering a different question than you are asking, but Prof Ng did give the formulas in the lectures for all the elements of the gradient calculations. Here's the expression he gives for dZ^{[1]} in the Week 3 lectures for a specific 2-layer network:

dZ^{[1]} = ( W^{[2]T} \cdot dZ^{[2]} ) * g^{[1]'}(Z^{[1]})

Where g^{[1]} is the activation function for layer 1, so you need the derivative of that function, g^{[1]\prime}. Note that \cdot is the ordinary matrix product, while * is element-wise multiplication.
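To make the shapes concrete, here is a minimal NumPy sketch of that one step (not the course's official code). It assumes layer 1 uses tanh, so g^{[1]\prime}(Z^{[1]}) = 1 - tanh(Z^{[1]})^2; the function name and dimensions are just illustrative.

```python
import numpy as np

def backprop_dZ1(W2, dZ2, Z1):
    """Compute dZ1 = (W2^T . dZ2) * g1'(Z1), assuming g1 = tanh.

    Assumed shapes: W2 is (n2, n1), dZ2 is (n2, m), Z1 is (n1, m).
    """
    dA1 = W2.T @ dZ2                    # matrix product, shape (n1, m)
    g1_prime = 1.0 - np.tanh(Z1) ** 2   # derivative of the tanh activation
    return dA1 * g1_prime               # element-wise product

# Tiny example with random values just to check the shapes line up
rng = np.random.default_rng(0)
n1, n2, m = 4, 1, 5
W2 = rng.standard_normal((n2, n1))
dZ2 = rng.standard_normal((n2, m))
Z1 = rng.standard_normal((n1, m))
print(backprop_dZ1(W2, dZ2, Z1).shape)  # (4, 5), same shape as Z1
```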

If the question is why that is the formula, that is beyond the scope of this course. Here's a thread with links to the derivation of backpropagation in general and references to the matrix calculus needed for that derivation.
