Derivation of dz = da * g'(z) vs. dz = a - y: how do the derivations of dz[1] and dz[2] differ?

  • In the backpropagation intuition video of week 3, it is written as dz = da * g’(z),

but in the previous videos dz = a - y. How did dz = da * g’(z) come up?

The instructor said it comes from the chain rule, but I don’t understand how.

for dz[1]
image

for dz[2]
image

How is dz[1] = da * g’(z), but dz[2] = a[2] - y?

Check this, and this, and this.
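In short, both forms come from the same rule, dz = da * g’(z). Here is a minimal sketch of why it reduces to a - y at the output layer, assuming a sigmoid output activation and the binary cross-entropy loss (the setting in which dz = a - y holds; these assumptions are not spelled out in the question):

```latex
\begin{aligned}
\mathcal{L}(a, y) &= -\bigl(y \log a + (1 - y)\log(1 - a)\bigr), \qquad a = \sigma(z) \\
da = \frac{\partial \mathcal{L}}{\partial a} &= -\frac{y}{a} + \frac{1 - y}{1 - a} = \frac{a - y}{a(1 - a)} \\
g'(z) = \sigma'(z) &= a(1 - a) \\
dz = da \cdot g'(z) &= \frac{a - y}{a(1 - a)} \cdot a(1 - a) = a - y
\end{aligned}
```

For a hidden layer there is no such cancellation, so the formula stays in the general form da * g’(z).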


I still have a doubt at the end: what is the derivative da[1]/dz[1]?

If it is just g’(z[1]),

then how is dz[1] = w[1]T * dz[2] * da[1] * g’(z[1])?

Even after checking the chain rule, I am not able to arrive at this formula as the instructor stated it. Please give an elaborate solution.

dL/dz[1] = dL/da[2] * da[2]/dz[2] * dz[2]/da[1] * da[1]/dz[1]

dL/da[2] * da[2]/dz[2] = dL/dz[2] = a[2] - y

dz[2]/da[1] = w[2]

What is da[1]/dz[1]?

How is dz[1] = w[1]T * dz[2] * da[1] * g’(z[1])?

How do you get this?

The derivative of the loss with respect to Z^{[1]} is:

\frac{dL}{dZ^{[1]}} = dZ^{[1]} = W^{[2]T} \cdot dZ^{[2]} * g^{[1]'}(Z^{[1]})

You already saw the chain rule.

I am asking you to elaborate on the chain rule.

dz[1] = w[1]T * dz[2] * da[1] * g’(z[1])

is what I wrote down from the instructor’s lectures. As my steps above show, I am not able to reach this as a conclusion.

These are my steps:

dL/dz[1] = dL/da[2] * da[2]/dz[2] * dz[2]/da[1] * da[1]/dz[1]

dL/da[2] * da[2]/dz[2] = dL/dz[2] = a[2] - y

dz[2]/da[1] = w[2]

What is da[1]/dz[1]?

I wanted the detailed steps, starting from the chain rule.

Which class did you get this from? Can you share a link with us?

As for the chain rule itself, it just requires familiarity with calculus. If you are not familiar with it, check this guide by Paul, which refers to some other links.


This is the class link.

Anyway, the guide by Paul finally clears up my doubt. Thank you.

I am glad your doubts are cleared now.

However, the link you gave never says that dz[1] = w[1]T * dz[2] * da[1] * g’(z[1]). It would be a good idea to watch the video again.

dz[1] = w[1]T * dz[2] * da[1] * g’(z[1]).

It said the same. I have already attached the screenshot in the question itself; please check the beginning of the thread.

I’ve checked the screenshot; it says dz[1] = w[2]T * dz[2] * g[1]’(z[1]). This is different from the one you mentioned.
Maybe it is a good time for you to watch that video again, or at least look at the screenshot which you shared.

Yes, as Saif has pointed out, the first operand on the RHS there is incorrect and should be W^{[2]T}. There’s another important error there as well: the operation between W^{[2]T} and dZ^{[2]} is not * (elementwise multiply), but a dot product (matrix multiplication). Check the dimensions and it will be obvious that elementwise would not work:

W^{[2]} is n^{[2]} x n^{[1]}
dZ^{[2]} is n^{[2]} x m

And of course we need the dimensions of dZ^{[1]} to be n^{[1]} x m, right?
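The dimension check can be tried directly in NumPy. This is only a sketch: the sizes `n1`, `n2`, `m` and the random values are hypothetical, chosen for illustration. `n2` is deliberately larger than 1, because with `n2 = 1` NumPy broadcasting would silently accept the elementwise product.

```python
import numpy as np

# Hypothetical layer sizes and batch size, chosen only for illustration.
# n2 > 1 so that NumPy broadcasting cannot mask the shape mismatch.
n1, n2, m = 4, 3, 5

rng = np.random.default_rng(0)
W2 = rng.standard_normal((n2, n1))    # W^{[2]}:  n2 x n1
dZ2 = rng.standard_normal((n2, m))    # dZ^{[2]}: n2 x m

# Matrix product: (n1 x n2) @ (n2 x m) -> (n1 x m), the shape dZ^{[1]} needs.
dA1 = W2.T @ dZ2
print(dA1.shape)  # (4, 5)

# Elementwise product of an (n1 x n2) and an (n2 x m) array:
# the shapes are incompatible, so NumPy raises a ValueError.
try:
    W2.T * dZ2
    elementwise_ok = True
except ValueError:
    elementwise_ok = False
print(elementwise_ok)  # False
```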

There is also no dA^{[1]} term required, since the first two terms actually give us dA^{[1]}. The formula is:

dA^{[1]} = W^{[2]T} \cdot dZ^{[2]}
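For a single example, this and the remaining factor da[1]/dz[1] asked about earlier in the thread can be sketched as follows. The lowercase per-example notation and the symbol \mathcal{L} for the loss are introduced here for illustration:

```latex
\begin{aligned}
z^{[2]} &= W^{[2]} a^{[1]} + b^{[2]}
  &&\Rightarrow\quad da^{[1]} = \frac{\partial \mathcal{L}}{\partial a^{[1]}} = W^{[2]T} dz^{[2]} \\
a^{[1]} &= g^{[1]}(z^{[1]}) \ \text{(elementwise)}
  &&\Rightarrow\quad \frac{\partial a^{[1]}_i}{\partial z^{[1]}_i} = g^{[1]'}(z^{[1]}_i)
\end{aligned}
```

Because g^{[1]} acts elementwise, the last factor multiplies elementwise, while the first step is a matrix product.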

Putting all that together with some parentheses to make it unambiguous, we have:

dZ^{[1]} = \left ( W^{[2]T} \cdot dZ^{[2]} \right ) * g^{[1]'}(Z^{[1]})

You can see that the dimensional analysis all works with that formulation.
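The formula can also be checked numerically. The sketch below is not from the course; it assumes a tanh hidden layer, a sigmoid output, and the binary cross-entropy loss summed over examples, with hypothetical sizes and random data. It compares the analytic dZ^{[1]} with a central finite-difference estimate.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def loss_from_Z1(Z1, W2, b2, Y):
    """Cross-entropy loss summed over examples, viewed as a function of Z1."""
    A1 = np.tanh(Z1)                  # assumed hidden activation g^{[1]} = tanh
    A2 = sigmoid(W2 @ A1 + b2)        # assumed sigmoid output layer
    return -np.sum(Y * np.log(A2) + (1 - Y) * np.log(1 - A2))

# Hypothetical sizes and random data, for illustration only.
rng = np.random.default_rng(1)
n0, n1, n2, m = 3, 4, 1, 5
X = rng.standard_normal((n0, m))
Y = rng.integers(0, 2, size=(n2, m)).astype(float)
W1 = rng.standard_normal((n1, n0)) * 0.5
b1 = np.zeros((n1, 1))
W2 = rng.standard_normal((n2, n1)) * 0.5
b2 = np.zeros((n2, 1))

# Forward pass.
Z1 = W1 @ X + b1
A1 = np.tanh(Z1)
A2 = sigmoid(W2 @ A1 + b2)

# Analytic gradients from the thread's formulas.
dZ2 = A2 - Y                          # output layer: a - y
dZ1 = (W2.T @ dZ2) * (1 - A1 ** 2)    # tanh'(z) = 1 - tanh(z)^2

# Central finite differences on each entry of Z1.
eps = 1e-6
numeric = np.zeros_like(Z1)
for i in range(n1):
    for j in range(m):
        Zp = Z1.copy()
        Zm = Z1.copy()
        Zp[i, j] += eps
        Zm[i, j] -= eps
        numeric[i, j] = (loss_from_Z1(Zp, W2, b2, Y)
                         - loss_from_Z1(Zm, W2, b2, Y)) / (2 * eps)

max_diff = np.max(np.abs(numeric - dZ1))
print(dZ1.shape, max_diff)
```

If the formula is right, `max_diff` should be tiny compared with the gradient entries themselves. Replacing `@` with `*`, or adding an extra dA^{[1]} factor, makes the two disagree.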