W3_A1_Ex-6_What's the link between dz[1] and w[2] equation?

paulinpaloalto · October 23, 2022, 10:00pm

As with everything here, it’s just an application of the Chain Rule. But the first thing to be clear about is the meaning of Prof Ng’s notation:

dz^{[1]} = \displaystyle \frac {\partial L}{\partial z^{[1]}}

So by the Chain Rule, we can write:

\displaystyle \frac {\partial L}{\partial z^{[1]}} = \frac {\partial L}{\partial a^{[1]}} \frac {\partial a^{[1]}}{\partial z^{[1]}}

Since of course we have:

a^{[1]} = g^{[1]}(z^{[1]})

That makes the second factor in the formula you highlight obvious:

\displaystyle \frac {\partial a^{[1]}}{\partial z^{[1]}} = g^{[1]'}(z^{[1]})
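As a quick sanity check on that factor, here is a minimal sketch that compares an analytic g' against a finite-difference estimate. It assumes the hidden activation is tanh, which is only an example choice; the function names are made up for illustration, and any differentiable g^{[1]} could be swapped in.

```python
import math

def g(z):
    """Hidden-layer activation. tanh is an assumed example, not the only option."""
    return math.tanh(z)

def g_prime(z):
    """Analytic derivative of tanh: 1 - tanh(z)^2."""
    a = math.tanh(z)
    return 1.0 - a * a

def numeric_derivative(f, z, eps=1e-6):
    """Central finite-difference approximation of f'(z)."""
    return (f(z + eps) - f(z - eps)) / (2.0 * eps)

# Print the analytic and numeric values side by side for a few inputs.
for z in (-2.0, -0.5, 0.0, 0.7, 3.0):
    print(z, g_prime(z), numeric_derivative(g, z))
```

The two columns should agree to many decimal places, which is all that the g^{[1]'}(z^{[1]}) factor is saying.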

Then do the Chain Rule one more time on the first factor:

\displaystyle \frac {\partial L}{\partial a^{[1]}} = \frac {\partial L}{\partial z^{[2]}} \frac {\partial z^{[2]}}{\partial a^{[1]}}

The forward propagation formula for layer 2 is:

z^{[2]} = W^{[2]} \cdot a^{[1]} + b^{[2]}

So that gives us:

\displaystyle \frac {\partial z^{[2]}}{\partial a^{[1]}} = W^{[2]T}
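One way to make the transpose less mysterious is to write that second Chain Rule step component by component. This is only a sketch for a single sample: the indices i (units of layer 2) and j (units of layer 1) are introduced here for illustration and are not part of Prof Ng's notation.

z^{[2]}_i = \displaystyle \sum_j W^{[2]}_{ij} a^{[1]}_j + b^{[2]}_i

\displaystyle \frac {\partial L}{\partial a^{[1]}_j} = \sum_i \frac {\partial L}{\partial z^{[2]}_i} \frac {\partial z^{[2]}_i}{\partial a^{[1]}_j} = \sum_i W^{[2]}_{ij} \, dz^{[2]}_i = \left( W^{[2]T} dz^{[2]} \right)_j

The sum runs over the row index of W^{[2]}, which is exactly what multiplying by W^{[2]T} on the left does.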

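If you reassemble that with all the previous formulas and with a little hand-waving about dot products and transposes, you get the original formula as shown:

dz^{[1]} = W^{[2]T} \cdot dz^{[2]} * g^{[1]'}(z^{[1]})

Here the first product is a matrix product and the * is elementwise multiplication.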
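If you would rather check the result than take it on faith, the formula dz^{[1]} = W^{[2]T} dz^{[2]} * g^{[1]'}(z^{[1]}) can be compared against finite differences. The sketch below is not from the course materials. It assumes a single sample, tanh in the hidden layer, sigmoid outputs with cross-entropy loss summed over the output units (so that dz^{[2]} = a^{[2]} - y), and made-up values for W2, b2, y and z1. It uses plain Python lists instead of NumPy to stay dependency-free.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Hypothetical tiny network: 3 hidden units, 2 output units, one sample.
W2 = [[0.2, -0.4, 0.1],
      [0.7, 0.3, -0.5]]          # shape (2, 3)
b2 = [0.1, -0.2]
y = [1.0, 0.0]
z1 = [0.5, -1.2, 0.3]

def forward(z1):
    """Return (a1, a2) given the layer-1 pre-activations z1."""
    a1 = [math.tanh(z) for z in z1]
    z2 = [sum(w * a for w, a in zip(row, a1)) + b for row, b in zip(W2, b2)]
    a2 = [sigmoid(z) for z in z2]
    return a1, a2

def loss(z1):
    """Binary cross-entropy, summed over the output units."""
    _, a2 = forward(z1)
    return -sum(t * math.log(a) + (1.0 - t) * math.log(1.0 - a)
                for t, a in zip(y, a2))

def analytic_dz1(z1):
    """dz1 = (W2^T dz2) * g'(z1), with g = tanh and dz2 = a2 - y."""
    a1, a2 = forward(z1)
    dz2 = [a - t for a, t in zip(a2, y)]
    # Entry j of W2^T dz2: sum over the rows i of W2.
    da1 = [sum(W2[i][j] * dz2[i] for i in range(len(dz2)))
           for j in range(len(a1))]
    # Elementwise product with tanh'(z1) = 1 - a1^2.
    return [d * (1.0 - a * a) for d, a in zip(da1, a1)]

def numeric_dz1(z1, eps=1e-6):
    """Central finite differences of the loss with respect to each z1[j]."""
    grads = []
    for j in range(len(z1)):
        plus = list(z1)
        minus = list(z1)
        plus[j] += eps
        minus[j] -= eps
        grads.append((loss(plus) - loss(minus)) / (2.0 * eps))
    return grads

print("analytic:", analytic_dz1(z1))
print("numeric: ", numeric_dz1(z1))
```

With two output units, dropping the transpose would not even give the right shape, which is a useful way to remember why it is there.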

In the bigger picture, please note that this course is designed not to require knowledge of even univariate calculus, let alone matrix calculus, so Prof Ng does not owe us derivations of any of these formulas. The good news is you don’t need to know calculus, but that means the bad news is you just have to take his word for it. If you have the calculus background to understand, here’s a thread with links to more detailed derivations and background information about matrix calculus.

