Why does dA[L] have no 1/m in Week 4?

HamedGholami · July 12, 2021, 7:50am

Hi, and thanks for reading my problem.
In Week 4, Andrew computes the derivative of the loss with respect to the last layer's output (A[L]), but his formula in the vectorized implementation has no 1/m. Does anyone know why he did not include 1/m in the derivative?
Thanks again.

kenb · July 12, 2021, 1:33pm

Hi @HamedGholami; welcome to the DLS specialization. Are you referring to a video lecture, a notebook assignment, or both?

HamedGholami · July 13, 2021, 2:43am

Hi @kenb; thank you for the kind words.
I’m referring to the 6th lecture; you can see dA at 8:12, written in the bottom right corner.
Thanks for your help.

HamedGholami · July 13, 2021, 3:08am

I understand it now. It is because we are calculating the derivative separately for every training example.
Thanks.

paulinpaloalto · July 14, 2021, 12:59am

Yes, exactly! When you see L, that is the vector-valued loss function with one value per sample. The average (and with it the 1/m factor) doesn’t come into the picture until you start taking derivatives of J, which is the average of L across the samples. The dAL value is just one of the Chain Rule factors you need to compute the actual gradients of J w.r.t. the various parameters.
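
To make the point concrete, here is a minimal sketch of where the 1/m appears. It is not the course's notebook code: it assumes a single sigmoid output layer with cross-entropy loss, uses plain Python lists instead of NumPy arrays so it runs on its own, and the names `forward`, `cost`, `backward` and the toy data are made up for illustration. dAL is computed per sample with no 1/m, the 1/m is applied only when forming dW and db, and a finite-difference check on J suggests that this placement gives the right gradient.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(w, b, X):
    # X holds m samples; each sample is a list of features.
    return [sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) for x in X]


def cost(w, b, X, Y):
    # J is the average of the per-sample losses L, so 1/m lives here.
    A = forward(w, b, X)
    losses = [-(y * math.log(a) + (1 - y) * math.log(1 - a))
              for a, y in zip(A, Y)]
    return sum(losses) / len(X)


def backward(w, b, X, Y):
    m = len(X)
    A = forward(w, b, X)
    # dAL: derivative of the per-sample loss L w.r.t. a, one value
    # per sample. No 1/m, because no averaging has happened yet.
    dAL = [-(y / a) + (1 - y) / (1 - a) for a, y in zip(A, Y)]
    # Chain rule through the sigmoid: dZ = dAL * a * (1 - a).
    dZ = [da * a * (1 - a) for da, a in zip(dAL, A)]
    # The 1/m enters only here, when per-sample terms are averaged
    # to obtain the gradients of J w.r.t. the parameters.
    dW = [sum(dz * x[j] for dz, x in zip(dZ, X)) / m
          for j in range(len(w))]
    db = sum(dZ) / m
    return dAL, dW, db


# Hypothetical toy data: m = 4 samples, 2 features each.
X = [[0.5, -1.0], [1.5, 2.0], [-0.3, 0.8], [2.0, 0.1]]
Y = [1, 0, 1, 0]
w = [0.1, -0.2]
b = 0.05

dAL, dW, db = backward(w, b, X, Y)

# Central finite-difference estimate of dJ/dw for comparison.
eps = 1e-6
numeric_dW = []
for j in range(len(w)):
    w_plus = list(w)
    w_plus[j] += eps
    w_minus = list(w)
    w_minus[j] -= eps
    numeric_dW.append(
        (cost(w_plus, b, X, Y) - cost(w_minus, b, X, Y)) / (2 * eps)
    )

print("dW (backprop):", dW)
print("dW (numeric): ", numeric_dW)
```

If a 1/m were also placed in dAL, the factor would be applied twice and dW would come out m times too small compared with the numeric estimate.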

HamedGholami · July 19, 2021, 5:44pm

Thank you for your great answer.
