Hello all,

This is more of a conceptual question about multiplying matrices. When we do gradient descent while building a NN, a series of matrix multiplications needs to be performed, as shown in the slides.

For example, take the following calculation, which I will type in as code:

dW2 = (1/m) * np.dot(dZ2, A1.T)

I used np.dot because I can see that the shapes require the matrices to be multiplied this way rather than element-wise. Conceptually, however, why do we need np.dot here and not an element-wise product?

In other words, I am not always clear on how to multiply two matrices A and B. When do we use A*B, and when do we use np.dot(A, B)?
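
To make the question concrete, here is a minimal sketch of the shape argument I mean. The layer sizes and random values are made up for illustration and are not from the slides; I am also assuming the usual layout with one column per training example.

```python
import numpy as np

# Hypothetical sizes, chosen only for illustration
n1, n2, m = 4, 3, 5  # units in layer 1, units in layer 2, number of examples

rng = np.random.default_rng(0)
dZ2 = rng.standard_normal((n2, m))  # assumed shape: one column per example
A1 = rng.standard_normal((n1, m))   # assumed shape: one column per example

# Matrix product: (n2, m) times (m, n1) gives (n2, n1)
dW2 = (1 / m) * np.dot(dZ2, A1.T)
print("np.dot result shape:", dW2.shape)

# Element-wise product needs equal (or broadcastable) shapes,
# and (n2, m) with (m, n1) is neither for these sizes
try:
    dZ2 * A1.T
    elementwise_ok = True
except ValueError as err:
    elementwise_ok = False
    print("element-wise product failed:", err)
```

So with these sizes only np.dot runs at all, which is the shape reasoning I used. What I am missing is the conceptual reason behind it.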