In the optional video explaining backpropagation in Course 1 of the Deep Learning Specialization, when we use the whole training set in the X matrix, we should consider the overall cost formula, which includes the 1/m term.
But when the professor calculates dL/dZ = A - Y, we don't include the 1/m.
Shouldn't it normally be dL/dZ = (1/m)(A - Y)?
Because we then add that factor when computing dW and db. Chain Rule. Please read this.
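In the notation of the lectures, the Chain Rule computation can be sketched like this (for logistic regression, with per-example loss $L^{(i)}$ and cost $J = \frac{1}{m}\sum_i L^{(i)}$):

$$\frac{\partial J}{\partial W} \;=\; \frac{1}{m}\sum_{i=1}^{m}\frac{\partial L^{(i)}}{\partial W} \;=\; \frac{1}{m}\sum_{i=1}^{m}\frac{\partial L^{(i)}}{\partial z^{(i)}}\,\frac{\partial z^{(i)}}{\partial W} \;=\; \frac{1}{m}\,(A - Y)\,X^{T}$$

So $dZ = A - Y$ is the per-example Chain Rule factor with no $\frac{1}{m}$ in it; the $\frac{1}{m}$ only appears at the final step, when the per-example contributions are averaged into $dW$ (and likewise into $db$).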
It's still not clear to me; I'm sorry.
Here is my derivation of dJ/dZ using a batch of m training examples:
You gave the answer right there: notice that what you wrote is the derivative of L, not J. Of course L is a vector quantity with m elements. You don't get the average until you reach the stage of computing derivatives of J, which is the average of L over the m samples. Literally the only quantities in any of this that are derivatives of J are dW and db. Everything else is just a "Chain Rule" factor used to compute dW and db, and those factors are not averages.
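To make this concrete, here is a small NumPy sketch (with made-up sizes and random data, not anything from the course assignments) that computes dW and db exactly as in the lectures, with dZ = A - Y carrying no 1/m, and then checks dW against a finite-difference gradient of J:

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 3, 5                        # hypothetical sizes: 3 features, 5 examples
X = rng.normal(size=(n, m))        # each column is one training example
Y = rng.integers(0, 2, size=(1, m)).astype(float)
W = rng.normal(size=(1, n))
b = 0.1

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost(W, b):
    A = sigmoid(W @ X + b)
    # J is the AVERAGE of the per-example losses L -- this is where 1/m lives
    return -np.mean(Y * np.log(A) + (1 - Y) * np.log(1 - A))

# Analytic gradients, exactly as in the lectures:
A = sigmoid(W @ X + b)
dZ = A - Y                         # Chain Rule factor: one column per example, no 1/m
dW = (1.0 / m) * dZ @ X.T          # the 1/m appears only here, in dJ/dW
db = (1.0 / m) * np.sum(dZ)        # ... and here, in dJ/db

# Numerical check: finite differences of J with respect to each entry of W
eps = 1e-6
dW_num = np.zeros_like(W)
for i in range(n):
    Wp, Wm = W.copy(), W.copy()
    Wp[0, i] += eps
    Wm[0, i] -= eps
    dW_num[0, i] = (cost(Wp, b) - cost(Wm, b)) / (2 * eps)

print(np.allclose(dW, dW_num, atol=1e-5))  # True: the analytic dW matches
```

If you dropped the 1/m from dW, the check above would fail by exactly a factor of m, which is a quick way to convince yourself where the averaging belongs.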