C4W1 CNN back propagation

yunxiangqin · October 12, 2021, 9:43pm

When calculating db by summing up dZ (optional assignment), why don’t we need to divide by the number of training examples m?

jonaslalin · November 2, 2021, 9:19am

Probably because the code computes dJ/db (the gradient of the cost J) and not dL/db (the gradient of the per-example loss L) as in the lecture videos. I like the former notation better because working with dJ/db directly is cleaner than introducing dL/db and dL/dW. We have a recent topic where you can read more.
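To make the dJ/db versus dL/db point concrete, here is a minimal NumPy sketch. It is not the assignment's code: the linear model, the squared-error loss, and all variable names are assumptions chosen only for illustration. It shows that if dZ already holds dJ/dZ (so the 1/m from averaging the cost is already inside it), then summing dZ gives dJ/db with no further division by m.

```python
import numpy as np

# Hypothetical toy setup (not from the assignment): z_i = x_i . w + b,
# per-example loss L_i = 0.5 * (z_i - y_i)^2, cost J = (1/m) * sum_i L_i.
rng = np.random.default_rng(0)
m = 5
X = rng.normal(size=(m, 3))
y = rng.normal(size=m)
w = rng.normal(size=3)
b = 0.3

def cost(b_value):
    """Cost J as a function of the bias only."""
    z = X @ w + b_value
    return np.mean(0.5 * (z - y) ** 2)

z = X @ w + b
dL_dz = z - y        # gradient of each per-example loss w.r.t. its own z_i
dJ_dz = dL_dz / m    # gradient of the cost: the 1/m is already included here

# Convention 1: dZ means dJ/dZ, so db is a plain sum.
db_from_cost = np.sum(dJ_dz)

# Convention 2: dZ means dL/dZ, so db needs the explicit 1/m.
db_from_loss = np.sum(dL_dz) / m

# Central-difference check of dJ/db.
eps = 1e-6
db_numeric = (cost(b + eps) - cost(b - eps)) / (2 * eps)

print(db_from_cost, db_from_loss, db_numeric)
```

All three printed values should agree up to floating-point error. Which convention a given notebook uses depends on what the incoming dZ represents, so it is worth checking where the 1/m is applied upstream.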

Related topics

Topic | Category and tags | Replies | Views | Date
The intuition of db^[l]=dz^[l] and da^[l-1]=w^[l-1].dz^[l] | Neural Networks and Deep Learning; coursera-platform | 4 | 789 | May 27, 2023
Week 3: wrong formula for the derivatives dZ[2] in videos and notebook | Neural Networks and Deep Learning; coursera-platform | 4 | 786 | August 20, 2022
Dividing by "m" in back propagation using vectorized implementation | Neural Networks and Deep Learning; week-module-3, coursera-platform | 3 | 462 | February 19, 2024
Course 1 - Week 4 - 1/m in backpropagation | Neural Networks and Deep Learning; coursera-platform | 12 | 668 | April 29, 2024
Derivation of formula for dZ[2] | Neural Networks and Deep Learning; coursera-platform | 2 | 592 | May 19, 2023