ConvNet Backprop

I think we should average the derivatives of the cost w.r.t. each training example by adding them and dividing by the number of training examples.
But in the implementation of backprop in a ConvNet from scratch, i.e. the Week 1 first assignment of the Convolutional Neural Networks course, they consider just adding all the derivatives across the training examples (without dividing by the number of training examples) to be correct.
Can anyone explain the reason for not dividing?
Or is the cost not the average cost over all training examples (just the total) in the case of a CNN?
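To make the question concrete, here is a minimal sketch of the two options being compared (the name `combine_gradients` and the array shapes are my own illustration, not the assignment's code): summing the per-example gradients versus summing and then dividing by the number of examples m.

```python
import numpy as np

def combine_gradients(dW_per_example, average=True):
    """dW_per_example: per-example filter gradients, shape (m, f, f, n_C_prev, n_C)."""
    dW = np.sum(dW_per_example, axis=0)   # add the gradients of the m training examples
    if average:
        dW = dW / dW_per_example.shape[0] # also divide by m -> average gradient
    return dW
```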

Hi brabeem,

As far as I can see, the addition of the derivatives is concerned with calculating the gradients for a single training example. To understand this, it may help to realize that backprop in a CNN entails a convolution of a filter with loss gradients, which involves addition over the loss gradients. This is explained here and here.
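As a rough illustration of that summation, here is a minimal sketch (the function name and shapes are assumptions, not the assignment's exact code) of the filter gradient dW for one training example, with a single filter, stride 1 and no padding: each output position adds its slice of the input, weighted by the corresponding loss gradient.

```python
import numpy as np

def conv_backward_dW_single_example(a_prev, dZ, f):
    """Sketch: gradient of the filter W for ONE training example.

    a_prev -- input activation for this example, shape (n_H_prev, n_W_prev, n_C_prev)
    dZ     -- loss gradient w.r.t. the conv output, shape (n_H, n_W)
              (single filter, stride 1, no padding assumed)
    f      -- filter size
    """
    n_H, n_W = dZ.shape
    dW = np.zeros((f, f, a_prev.shape[-1]))
    # The filter gradient is a sum over every position where the filter was applied:
    # each output position contributes its input slice, scaled by that position's loss gradient.
    for h in range(n_H):
        for w in range(n_W):
            a_slice = a_prev[h:h + f, w:w + f, :]
            dW += a_slice * dZ[h, w]
    return dW

if __name__ == "__main__":
    a_prev = np.random.randn(5, 5, 3)  # toy input
    dZ = np.random.randn(3, 3)         # toy loss gradient (f=3, stride 1, no pad -> 3x3 output)
    print(conv_backward_dW_single_example(a_prev, dZ, f=3).shape)  # (3, 3, 3)
```

Note that this sum runs over the positions within one example. When the per-example gradients are later combined over a batch, whether you also divide by m typically depends on whether the 1/m from the average cost has already been folded into dZ upstream.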
