Things work out differently at the output layer because the derivative of the sigmoid activation and the derivative of the cross entropy loss combine in a way that simplifies very nicely. Here's a thread which shows that. At the inner layers of the network, you don't get that nice simplification.
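For reference, here is a quick sketch of that simplification, assuming a sigmoid output unit $a = \sigma(z)$ and the binary cross entropy loss on a single example:

$$
L(a, y) = -\big[\, y \ln a + (1 - y)\ln(1 - a) \,\big]
$$

$$
\frac{\partial L}{\partial a} = -\frac{y}{a} + \frac{1 - y}{1 - a} = \frac{a - y}{a(1 - a)},
\qquad
\frac{da}{dz} = \sigma(z)\big(1 - \sigma(z)\big) = a(1 - a)
$$

$$
\frac{\partial L}{\partial z} = \frac{\partial L}{\partial a}\cdot\frac{da}{dz} = a - y
$$

The $a(1 - a)$ factors cancel, which is why the output-layer gradient reduces to $dZ^{[L]} = A^{[L]} - Y$. At the hidden layers no such cancellation happens, so the $g'(Z^{[l]})$ term remains in the backpropagation formulas.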
But note that derivations involving calculus are beyond the scope of these courses. If you have some math background, here's a thread with links to more information on the derivations of backpropagation.