Course 1, Week 2, Programming Exercise 5 - Propagate

Hi, I figured out the code by looking at the shapes of the matrices/arrays, but I’m not sure when to use

  1. np.dot(a, b)
  2. a * b

details:

  • in the “cost” code, I used a * b, which is an element-wise calculation, and then np.sum to add all the individual losses together into the cost
  • but in the “dw” code, I needed np.dot(a, b) to get a matrix multiplication. Why can’t it be a * b? I’m not fully understanding when to use which; a generic sketch of the two patterns follows below.
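Roughly, the two patterns look like this made-up sketch (not my actual exercise code; the array names and shapes are invented here just to mirror the usual quantities):

```python
import numpy as np

# Made-up shapes that mirror the usual quantities:
# A: predictions (1, m), Y: labels (1, m), X: features (n, m)
m, n = 5, 3
rng = np.random.default_rng(0)
A = rng.uniform(0.1, 0.9, (1, m))   # keep away from 0/1 so log() is safe
Y = rng.integers(0, 2, (1, m))
X = rng.random((n, m))

# "cost" pattern: element-wise * gives the per-example losses,
# then np.sum collapses them into one scalar cost
cost = -np.sum(Y * np.log(A) + (1 - Y) * np.log(1 - A)) / m

# "dw" pattern: np.dot multiplies AND sums in one step,
# so the result is an (n, 1) gradient, not a scalar
dw = np.dot(X, (A - Y).T) / m

print(cost)       # a single number
print(dw.shape)   # (n, 1)
```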

(I don’t think I can post my actual exercise code here? Let me know if there is a way to share it.)

Thanks in advance!

A dot product computes the sum of the products of the elements, so the shape of the result will be different from that of either operand.

The * operator computes the element-wise products, but doesn’t include the sum. The shape of the result will be the same as the shape of the two operands.

If the equation you’re implementing contains a sum over products, that suggests np.dot; in either case, check the shape the result needs to have.
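For example (made-up numbers):

```python
import numpy as np

a = np.array([[1.0, 2.0, 3.0]])   # shape (1, 3)
b = np.array([[4.0, 5.0, 6.0]])   # shape (1, 3)

print(a * b)           # [[ 4. 10. 18.]]  element-wise, shape (1, 3), no sum
print(np.sum(a * b))   # 32.0             * plus the explicit sum
print(np.dot(a, b.T))  # [[32.]]          products and sum in one call, shape (1, 1)
```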


Here’s a thread which talks about the general question of when to use * versus np.dot.

That thread is linked from the DLS FAQ Thread, which is worth a look if it is new to you.


Hi @chengchenggan,

In a matrix multiplication between a matrix A (with i rows) and a matrix B (with k columns), there are i \times k dot products. I believe you know how to implement one dot product with a * b and np.sum, so if you wanted to implement a matrix multiplication by writing out all of those dot products yourself, it would be very complicated.

Cheers,
Raymond

Appreciate all the quick replies! Looking at the dimensions makes sense!

Hi @rmwkwok, I understand the first sentence. Is it possible to elaborate on “so if you want to implement a matrix multiplication through implementing all of the dot products, it will be very complicated”? Do you mean

  1. an i \times k matrix multiplication on a large dataset is already complicated in itself, or
  2. after it is implemented, it will be hard to apply np.sum?

Hey @chengchenggan,

Even with a small A and a small B like the following, it is complicated:

A = \begin{bmatrix} a_{00} & a_{01} & a_{02} \\ a_{10} & a_{11} & a_{12} \\ \end{bmatrix}

B = \begin{bmatrix} b_{00} & b_{01} \\ b_{10} & b_{11} \\ b_{20} & b_{21} \\ \end{bmatrix}

Don’t just think about it; write down how you would compute A \times B with element-wise multiplication * and np.sum (you will have to loop through the rows of A and the columns of B).

Compare what you have written down to just np.dot(A, B) or just A @ B, and you will see how complicated it is in terms of coding effort.
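For instance, here is a sketch of what that write-down turns into in NumPy (made-up numbers for the entries of A and B):

```python
import numpy as np

A = np.array([[1., 2., 3.],
              [4., 5., 6.]])     # 2 rows
B = np.array([[ 7.,  8.],
              [ 9., 10.],
              [11., 12.]])       # 2 columns

# One (a * b, np.sum) pair per dot product: 2 x 2 = i x k of them
C = np.zeros((A.shape[0], B.shape[1]))
for i in range(A.shape[0]):         # loop through the rows of A
    for k in range(B.shape[1]):     # loop through the columns of B
        C[i, k] = np.sum(A[i, :] * B[:, k])

print(np.allclose(C, np.dot(A, B)))  # True
print(np.allclose(C, A @ B))         # True -- one short expression instead
```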

It is not impossible, but it is complicated in terms of coding effort.

It is possible, but slow in terms of performance.
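A rough micro-benchmark to see the gap (made-up sizes; the exact ratio depends on your machine):

```python
import time
import numpy as np

rng = np.random.default_rng(0)
A = rng.random((200, 300))
B = rng.random((300, 200))

# The looped version: i x k element-wise products and sums
t0 = time.perf_counter()
C = np.zeros((A.shape[0], B.shape[1]))
for i in range(A.shape[0]):
    for k in range(B.shape[1]):
        C[i, k] = np.sum(A[i, :] * B[:, k])
loop_time = time.perf_counter() - t0

# The single vectorized call
t0 = time.perf_counter()
D = np.dot(A, B)
dot_time = time.perf_counter() - t0

print(np.allclose(C, D))     # True -- same answer
print(loop_time / dot_time)  # np.dot is typically orders of magnitude faster
```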

Cheers,
Raymond
