Week 3: wrong formula for the derivatives dZ[2] in videos and notebook

paulinpaloalto · August 20, 2022, 3:00pm

Where does it say that Prof Ng is required to be consistent in his notation? Those d values are just shorthand, and what each one abbreviates depends on the context. It turns out that:

dW^{[l]} = \displaystyle \frac {\partial J}{\partial W^{[l]}}

You just have to understand the context to see why the formulas turn out the way that they do.

Keep in mind that the only dX values that are partial derivatives of J are the dW^{[l]} and db^{[l]} gradients. Every other value (for example dZ^{[l]} or dA^{[l]}) is a partial derivative of something other than J, namely the per-sample loss L.
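To make the distinction concrete, here is a sketch of the definitions for layer 2. It assumes the usual course setup, which this post does not spell out: a sigmoid output unit, the cross-entropy loss L on each sample, and J defined as the average of L over the m samples.

```latex
% Assumptions (not stated in the post): sigmoid output, cross-entropy loss,
% and column i of each matrix corresponds to sample i.
\begin{align*}
J &= \frac{1}{m} \sum_{i=1}^{m} L^{(i)} \\
dZ^{[2]} &= A^{[2]} - Y
  \quad \text{(column $i$ is } \partial L^{(i)} / \partial z^{[2](i)} \text{, so no } \tfrac{1}{m}\text{)} \\
dW^{[2]} &= \frac{\partial J}{\partial W^{[2]}} = \frac{1}{m}\, dZ^{[2]} A^{[1]T} \\
db^{[2]} &= \frac{\partial J}{\partial b^{[2]}} = \frac{1}{m} \sum_{i=1}^{m} dZ^{[2](i)}
\end{align*}
```

The factor of \frac{1}{m} shows up only in the last two lines, which are the only ones that are derivatives of J.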

You have to think through how the Chain Rule applies when you compute the gradients of the W or b values at one of the inner layers of the network. The L to J transition (averaging over the m samples) is always there, but it's the last step, right? That means the factor of \frac {1}{m} should show up exactly once, in dW^{[l]} and db^{[l]}. If you also put it in the dZ values, you end up with multiple factors of \frac {1}{m} …
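If you want to convince yourself numerically, here is a minimal sketch of a gradient check. The network is an assumption rather than anything from the notebook: two layers, tanh hidden units, a sigmoid output, cross-entropy cost, and random data. The helper names (`forward`, `cost`, `numeric_grad`) are made up for this example.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(W1, b1, W2, b2, X):
    # Hypothetical 2-layer network: tanh hidden layer, sigmoid output.
    A1 = np.tanh(W1 @ X + b1)
    A2 = sigmoid(W2 @ A1 + b2)
    return A1, A2

def cost(W1, b1, W2, b2, X, Y):
    # J is the average over the m samples of the per-sample loss L.
    _, A2 = forward(W1, b1, W2, b2, X)
    m = X.shape[1]
    return -np.sum(Y * np.log(A2) + (1 - Y) * np.log(1 - A2)) / m

def numeric_grad(f, W, eps=1e-6):
    # Central finite differences of the scalar f() with respect to W,
    # perturbing W in place one entry at a time.
    grad = np.zeros_like(W)
    for idx in np.ndindex(W.shape):
        old = W[idx]
        W[idx] = old + eps
        plus = f()
        W[idx] = old - eps
        minus = f()
        W[idx] = old
        grad[idx] = (plus - minus) / (2 * eps)
    return grad

rng = np.random.default_rng(0)
n_x, n_h, m = 3, 4, 5
X = rng.normal(size=(n_x, m))
Y = (rng.random(size=(1, m)) > 0.5).astype(float)
W1 = rng.normal(size=(n_h, n_x)) * 0.5
b1 = np.zeros((n_h, 1))
W2 = rng.normal(size=(1, n_h)) * 0.5
b2 = np.zeros((1, 1))

A1, A2 = forward(W1, b1, W2, b2, X)
dZ2 = A2 - Y                        # derivative of L per sample: no 1/m
dW2 = (dZ2 @ A1.T) / m              # derivative of J: the single 1/m
dZ1 = (W2.T @ dZ2) * (1 - A1 ** 2)  # still per sample; tanh'(z) = 1 - a^2
dW1 = (dZ1 @ X.T) / m               # derivative of J again

# What happens if 1/m is (wrongly) applied in dZ2 as well as in dW2.
dW2_double = ((dZ2 / m) @ A1.T) / m

J = lambda: cost(W1, b1, W2, b2, X, Y)
num_dW2 = numeric_grad(J, W2)
num_dW1 = numeric_grad(J, W1)

print("dW2 matches numeric gradient:", np.allclose(dW2, num_dW2, atol=1e-6))
print("dW1 matches numeric gradient:", np.allclose(dW1, num_dW1, atol=1e-6))
print("double 1/m matches:", np.allclose(dW2_double, num_dW2, atol=1e-6))
```

The first two checks compare the backprop formulas against finite differences of J itself, so they only pass if the 1/m appears once along the whole chain. The third variant is too small by a factor of m.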
