Hello there,
I am quite confused about when element-wise multiplication versus matrix multiplication (dot-product style) is used in the chain rule during backpropagation.
Let’s assume we have a shallow network with 2 inputs, one hidden layer with 3 units, and one output unit. We will also use Professor Ng’s notation: (A0, A1, A2) for the activations, (Z1, Z2) for the linear operations, (W1, W2) for the weight matrices, and J for the cost.
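To make the shapes explicit, here is how I am writing the forward pass (assuming the m training examples are stacked as columns, as in the course):

$$
\begin{aligned}
Z1 &= W1\,A0 + b1, &\quad W1&: (3, 2),\; A0: (2, m),\; Z1: (3, m)\\
A1 &= g(Z1), & A1&: (3, m)\\
Z2 &= W2\,A1 + b2, & W2&: (1, 3),\; Z2: (1, m)\\
A2 &= \sigma(Z2), & A2&: (1, m)
\end{aligned}
$$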
So, when using the chain rule during backpropagation to compute:

$$
\frac{\partial J}{\partial W1} = \frac{\partial J}{\partial A2} \cdot \frac{\partial A2}{\partial Z2} \cdot \frac{\partial Z2}{\partial A1} \cdot \frac{\partial A1}{\partial Z1} \cdot \frac{\partial Z1}{\partial W1}
$$

how do we know whether each multiplication in the chain is element-wise or a matrix (dot) product?
I ask because when I compute all these partial derivatives by hand and then check that the shapes match on both sides of the equation, it does not work out with matrix multiplications alone.
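To make the shape problem concrete, here is a minimal NumPy sketch of what I mean (the batch size m = 5 and the tanh/sigmoid activations are just my own choices for illustration):

```python
import numpy as np

np.random.seed(0)

m = 5                        # batch size, my own choice just for illustration
A0 = np.random.randn(2, m)   # 2 input features, examples stacked as columns
W1 = np.random.randn(3, 2)   # hidden layer: 3 units
W2 = np.random.randn(1, 3)   # output layer: 1 unit

Z1 = W1 @ A0                 # (3, m)
A1 = np.tanh(Z1)             # (3, m)
Z2 = W2 @ A1                 # (1, m)
A2 = 1 / (1 + np.exp(-Z2))   # (1, m)  sigmoid output

# Shapes of the individual chain-rule factors, as I compute them by hand:
#   dJ/dA2  -> (1, m)
#   dA2/dZ2 -> sigmoid'(Z2), taken element-wise, so (1, m)
#   dZ2/dA1 -> W2, (1, 3)
#   dA1/dZ1 -> g'(Z1) = 1 - A1**2 for tanh, element-wise, so (3, m)
#   dZ1/dW1 -> A0, (2, m)
#
# Chaining these with matrix products only, e.g.
#   (1, m) @ (1, m) @ (1, 3) @ (3, m) @ (2, m),
# the inner dimensions do not even line up, and the final result
# should have the shape of W1, i.e. (3, 2).
print(A0.shape, W1.shape, W2.shape, A2.shape)
```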