dZ[1] derivation

Leandro_Pires · November 3, 2021, 10:00pm

Hi, everyone,

I don’t think I understand how
dZ^{[2]} = A^{[2]} - Y
but
dZ^{[1]} = W^{[2]T} dZ^{[2]} * g^{[1]'}(Z^{[1]})

(week 3 material)
Intuitively, I was expecting it to be something as short as the formula for dZ^{[2]}. Can someone guide me to this conclusion?

Thanks in advance,
Leandro Pires.

paulinpaloalto · November 4, 2021, 2:58am

The formula you show for dZ^{[1]} is the generic formula that works at any layer. The specific formula you show for dZ^{[2]} corresponds to the case in which layer 2 is the output of a binary classifier with the cross entropy loss function. You can find the derivation of that in this thread.
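As a rough sketch of how the two formulas relate (not the linked derivation itself): assume the output activation is the sigmoid, a = \sigma(z), and the per-example loss is the binary cross entropy. Applying the generic rule dZ = dA * g'(Z) at the output layer, the factors cancel and leave A - Y. At layer 1 there is no such cancellation, so the generic form stays visible. Here * is the elementwise product, and the lowercase a, z, y are single-example stand-ins for A^{[2]}, Z^{[2]}, Y.

```latex
% Output layer (sigmoid + binary cross entropy, one example):
\mathcal{L}(a, y) = -\bigl(y \log a + (1 - y)\log(1 - a)\bigr), \qquad a = \sigma(z)

\frac{\partial \mathcal{L}}{\partial a} = -\frac{y}{a} + \frac{1 - y}{1 - a},
\qquad
\sigma'(z) = a(1 - a)

dz = \frac{\partial \mathcal{L}}{\partial a}\,\sigma'(z)
   = -y(1 - a) + (1 - y)\,a
   = a - y

% Hidden layer: Z^{[2]} = W^{[2]} A^{[1]} + b^{[2]} gives dA^{[1]} = W^{[2]T} dZ^{[2]},
% and the same generic rule then yields
dZ^{[1]} = dA^{[1]} * g^{[1]'}(Z^{[1]})
         = W^{[2]T} dZ^{[2]} * g^{[1]'}(Z^{[1]})
```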

If you want to see the derivation of the general formula for dZ^{[l]}, that is beyond the scope of this course. Please see this thread for some links that cover the derivation.
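Short of the full derivation, the formula for dZ^{[1]} can at least be checked numerically. The sketch below is only an illustration under stated assumptions: a 2-layer network with a tanh hidden layer and a sigmoid output, random values, and the loss summed over examples (the 1/m factor in the course only enters at dW and db). The function names and the layer sizes are made up for this example.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def total_loss_from_Z1(Z1, W2, b2, Y):
    # Forward pass from Z1 onward; cross-entropy loss summed over examples.
    A1 = np.tanh(Z1)                 # assumed hidden activation g^{[1]} = tanh
    A2 = sigmoid(W2 @ A1 + b2)       # assumed sigmoid output layer
    return float(-np.sum(Y * np.log(A2) + (1 - Y) * np.log(1 - A2)))

def dZ1_by_formula(Z1, W2, b2, Y):
    # dZ2 = A2 - Y, then dZ1 = W2^T dZ2 * g1'(Z1), with tanh'(z) = 1 - tanh(z)^2.
    A1 = np.tanh(Z1)
    A2 = sigmoid(W2 @ A1 + b2)
    dZ2 = A2 - Y
    return (W2.T @ dZ2) * (1 - A1 ** 2)

def dZ1_by_finite_differences(Z1, W2, b2, Y, eps=1e-6):
    # Central differences: perturb one entry of Z1 at a time.
    grad = np.zeros_like(Z1)
    for idx in np.ndindex(*Z1.shape):
        Z_plus, Z_minus = Z1.copy(), Z1.copy()
        Z_plus[idx] += eps
        Z_minus[idx] -= eps
        grad[idx] = (total_loss_from_Z1(Z_plus, W2, b2, Y)
                     - total_loss_from_Z1(Z_minus, W2, b2, Y)) / (2 * eps)
    return grad

# Hypothetical sizes: 4 hidden units, 5 training examples.
rng = np.random.default_rng(0)
n_h, m = 4, 5
Z1 = rng.standard_normal((n_h, m))
W2 = rng.standard_normal((1, n_h))
b2 = rng.standard_normal((1, 1))
Y = rng.integers(0, 2, size=(1, m)).astype(float)

analytic = dZ1_by_formula(Z1, W2, b2, Y)
numeric = dZ1_by_finite_differences(Z1, W2, b2, Y)
print("max abs difference:", float(np.max(np.abs(analytic - numeric))))
```

If the two gradients agree to within numerical error, that supports reading dZ^{[1]} as the chain rule applied through Z^{[2]} and A^{[1]}, which is why it carries the extra W^{[2]T} and g^{[1]'} factors.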
