How do we get dz[1] = w[2]T.dz[2] * g[1]'(z[1])?

Shabbar_Zaidi · May 7, 2024, 9:33pm

The formula dz[1] = w[2]T.dz[2] * g[1]'(z[1]) is derived in the optional video Backpropagation Intuition (6:48). I am confused about why dz[1] is not equal to a[1] - y, the same way we got dz[2] = a[2] - y.
How do we get w[2]T.dz[2] * g[1]'(z[1])?

paulinpaloalto · May 7, 2024, 11:49pm

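The formulas start out the same in both layers: dz = da * g'(z). In the output layer you get some simplification, because the derivative of sigmoid and the derivative of the cross-entropy loss work very nicely together and the product collapses to a[2] - y. The hidden layer has no such cancellation, so the general form stays. Mubsi and Eddy showed that special case for the output layer on this thread.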
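For anyone who wants to see the cancellation written out, here is a short sketch for a single training example. It assumes a sigmoid output activation and the binary cross-entropy loss used in this week's material, with da and dz denoting the derivatives of the loss with respect to a and z, and * denoting the elementwise product.

```latex
% Output layer: the loss derivative and the sigmoid derivative cancel
\mathcal{L} = -y \log a^{[2]} - (1 - y) \log\bigl(1 - a^{[2]}\bigr)
\quad\Rightarrow\quad
da^{[2]} = -\frac{y}{a^{[2]}} + \frac{1 - y}{1 - a^{[2]}}

g^{[2]\prime}(z^{[2]}) = a^{[2]} \bigl(1 - a^{[2]}\bigr)

dz^{[2]} = da^{[2]} \, g^{[2]\prime}(z^{[2]})
         = -y \bigl(1 - a^{[2]}\bigr) + (1 - y)\, a^{[2]}
         = a^{[2]} - y

% Hidden layer: same starting formula, but nothing cancels
z^{[2]} = W^{[2]} a^{[1]} + b^{[2]}
\quad\Rightarrow\quad
da^{[1]} = W^{[2]T} dz^{[2]}

dz^{[1]} = da^{[1]} * g^{[1]\prime}(z^{[1]})
         = W^{[2]T} dz^{[2]} * g^{[1]\prime}(z^{[1]})
```

The loss only depends on a[1] through z[2], which is why da[1] picks up the factor W[2]T, and since a[1] never appears directly in the loss there is no a[1] - y term to be had.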

All this is basically just the Chain Rule, but applied to vectors and matrices. Prof Ng has designed these courses not to require calculus, so we just have to take his word for the formulas. If you have the math background to understand, here’s a thread with links to the derivations of all this.
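If you would rather check the formula than derive it, a numerical gradient check is one way to do that. The sketch below is only an illustration, not course code: it assumes a single training example, a tanh hidden layer, a sigmoid output with cross-entropy loss, and made-up values for z1, w2, b2 and y. All function names are hypothetical. It compares the backprop formula against a finite-difference estimate of the derivative of the loss with respect to each z[1] entry.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def loss_from_z1(z1, w2, b2, y):
    """Cross-entropy loss viewed as a function of the hidden pre-activations z1."""
    a1 = [math.tanh(z) for z in z1]                      # assumed g[1] = tanh
    z2 = sum(w * a for w, a in zip(w2, a1)) + b2
    a2 = sigmoid(z2)                                     # assumed g[2] = sigmoid
    return -(y * math.log(a2) + (1 - y) * math.log(1 - a2))

def analytic_dz1(z1, w2, b2, y):
    """Backprop formula: dz1 = w2^T . dz2 * g1'(z1), written per hidden unit."""
    a1 = [math.tanh(z) for z in z1]
    a2 = sigmoid(sum(w * a for w, a in zip(w2, a1)) + b2)
    dz2 = a2 - y                                         # output-layer special case
    da1 = [w * dz2 for w in w2]                          # w2^T . dz2
    return [da * (1 - a * a) for da, a in zip(da1, a1)]  # tanh'(z) = 1 - tanh(z)^2

def numeric_dz1(z1, w2, b2, y, eps=1e-6):
    """Central finite-difference estimate of dLoss/dz1[j] for each hidden unit j."""
    grads = []
    for j in range(len(z1)):
        plus, minus = list(z1), list(z1)
        plus[j] += eps
        minus[j] -= eps
        diff = loss_from_z1(plus, w2, b2, y) - loss_from_z1(minus, w2, b2, y)
        grads.append(diff / (2 * eps))
    return grads

# Made-up values for a hidden layer with three units
z1 = [0.5, -1.2, 0.3]
w2 = [0.7, -0.4, 1.1]
b2 = 0.1
y = 1.0

analytic = analytic_dz1(z1, w2, b2, y)
numeric = numeric_dz1(z1, w2, b2, y)
max_diff = max(abs(a - n) for a, n in zip(analytic, numeric))

print("analytic:", analytic)
print("numeric: ", numeric)
print("max abs difference:", max_diff)
```

If the two lists agree to several decimal places, the formula is consistent with the definition of the derivative for this setup. The vectorized version in the course does the same thing for all examples at once.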
