W3_Vectorization of dZ[2] equations

Shawn_Shan · March 30, 2023, 4:55am

For a single example, the derivative of the loss L with respect to z[2] is a[2] - y. Why does the derivative of the cost function J with respect to Z[2] have the same form as the single-example equation, A[2] - Y? Where does the 1/m in J go?

saifkhanengr · March 30, 2023, 6:32am

Hello @Shawn_Shan! Interesting question…

All the gradients (derivatives) come from the chain rule. Are you familiar with it? When we apply the 1/m factor at dW, the chain rule means it accounts for dZ and dA too.
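One way to see where the 1/m ends up is a quick numerical check. The sketch below assumes a sigmoid output layer with the binary cross-entropy loss (the setup in which a[2] - y arises); the names `cost`, `cost_from_Z`, and the array shapes are made up for illustration and are not the course notebook's code.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost_from_Z(Z2, Y):
    """Cost J = (1/m) * sum over examples of the cross-entropy loss L."""
    m = Y.shape[1]
    A2 = sigmoid(Z2)
    return -np.sum(Y * np.log(A2) + (1 - Y) * np.log(1 - A2)) / m

def cost(W2, b2, A1, Y):
    return cost_from_Z(W2 @ A1 + b2, Y)

# Small random problem (hypothetical sizes): m examples, n1 hidden units.
rng = np.random.default_rng(0)
m, n1 = 5, 4
A1 = rng.standard_normal((n1, m))
Y = rng.integers(0, 2, size=(1, m)).astype(float)
W2 = rng.standard_normal((1, n1)) * 0.1
b2 = np.zeros((1, 1))

# Lecture-style formulas: no 1/m in dZ2, the 1/m appears in dW2 and db2.
Z2 = W2 @ A1 + b2
A2 = sigmoid(Z2)
dZ2 = A2 - Y
dW2 = dZ2 @ A1.T / m
db2 = np.sum(dZ2, axis=1, keepdims=True) / m

eps = 1e-6

# Central-difference gradient of J with respect to W2.
dW2_num = np.zeros_like(W2)
for j in range(n1):
    step = np.zeros_like(W2)
    step[0, j] = eps
    dW2_num[0, j] = (cost(W2 + step, b2, A1, Y) - cost(W2 - step, b2, A1, Y)) / (2 * eps)

# Central-difference gradient of J with respect to Z2 itself.
dJ_dZ2_num = np.zeros_like(Z2)
for i in range(m):
    step = np.zeros_like(Z2)
    step[0, i] = eps
    dJ_dZ2_num[0, i] = (cost_from_Z(Z2 + step, Y) - cost_from_Z(Z2 - step, Y)) / (2 * eps)

print("dW2 matches dJ/dW2:      ", np.allclose(dW2, dW2_num, atol=1e-6))
print("dJ/dZ2 matches (A2-Y)/m: ", np.allclose(dJ_dZ2_num, dZ2 / m, atol=1e-6))
```

Under these assumptions, dW2 computed with the 1/m agrees with the true gradient of J, while the true dJ/dZ2 equals (A2 - Y)/m. So the A2 - Y written as dZ2 is a per-example loss gradient, and the averaging is deferred to dW2 and db2.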

Shawn_Shan · March 30, 2023, 5:15pm

Thank you for your response. I understand the chain rule. If dZ is written as in the lecture, all the following derivatives make sense. But dZ itself should be defined clearly. If dZ were dJ/dZ, it would need the 1/m factor, and then dW and db would not need that 1/m term. I think that in the lecture dz is defined as dL/dz, and dZ should be defined as d(Sum of L(i))/dZ instead of dJ/dZ.
See my derivation below.

saifkhanengr · March 31, 2023, 1:57am

Hello @Shawn_Shan! I asked our top mentor @paulinpaloalto the same question some time ago. Here is his answer. Check it out and let me know what you think.

Best,
Saif.

Shawn_Shan · March 31, 2023, 2:22am

This is very helpful. Basically, dA or dZ is not dJ/dA but dL/dA. That makes sense of all the remaining equations.
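Written out, the convention looks roughly like this. It is a sketch that assumes a sigmoid output with the cross-entropy loss, and that column i of dZ[2] holds the per-example gradient; the superscript notation follows the course's usual style rather than anything quoted in this thread.

```latex
J = \frac{1}{m}\sum_{i=1}^{m} L^{(i)}, \qquad
dz^{[2](i)} := \frac{\partial L^{(i)}}{\partial z^{[2](i)}} = a^{[2](i)} - y^{(i)}
\;\;\Longrightarrow\;\; dZ^{[2]} = A^{[2]} - Y

\frac{\partial J}{\partial W^{[2]}}
  = \frac{1}{m}\sum_{i=1}^{m} \frac{\partial L^{(i)}}{\partial z^{[2](i)}}\, a^{[1](i)T}
  = \frac{1}{m}\, dZ^{[2]} A^{[1]T}, \qquad
\frac{\partial J}{\partial b^{[2]}}
  = \frac{1}{m}\sum_{i=1}^{m} dz^{[2](i)}

\frac{\partial J}{\partial Z^{[2]}} = \frac{1}{m}\left(A^{[2]} - Y\right)
```

The last line is what dZ[2] would be if it were defined as dJ/dZ[2]; in that case the 1/m would already be inside it, and dW[2] and db[2] would not need their own 1/m.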

In fact, the original question you asked in that thread is what I was trying to do in one of my projects (linear regression using ReLU as the activation). That thread was helpful for my work. Thank you.

saifkhanengr · March 31, 2023, 2:24am

I am glad you like it.

Best,
Saif.
