np.matmul vs np.dot

Hello

In this slide, the code np.matmul(AT,w) was used for matrix multiplication.
It seems that np.dot(AT,w) does the same thing. Is it essentially the same?

Thanks

Eric


Hello @Eric5
matmul differs from dot in two ways.

  • Multiplication by scalars is not allowed.
  • Stacks of matrices are broadcast together as if the matrices were elements, which means the two functions behave differently for 3D or higher-dimensional inputs (see the sketch after this list).
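
To make both points concrete, here is a minimal sketch (the array shapes are chosen purely for illustration):

import numpy as np

A = np.ones((2, 3, 4))  # a "stack" of two 3x4 matrices
B = np.ones((2, 4, 5))  # a "stack" of two 4x5 matrices

# matmul broadcasts over the stack dimension: each 3x4 matrix is
# multiplied with the corresponding 4x5 matrix.
print(np.matmul(A, B).shape)  # (2, 3, 5)

# dot instead contracts the last axis of A with the second-to-last
# axis of B across all stacks, producing a larger result.
print(np.dot(A, B).shape)     # (2, 3, 2, 5)

# dot accepts scalar operands, matmul does not:
print(np.dot(3, A).shape)     # (2, 3, 4)
# np.matmul(3, A)             # raises a ValueError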

Hello @Eric5. Here is a thread which I think can clear up your doubt.

Hi Eric,

The other solutions are correct as well, but I just wanted to go into a little more detail in case it's helpful.

You are correct mathematically, but only in a restricted setting: the dot product is essentially the same as matrix multiplication, but only when you are multiplying two vectors (the mathematical dot product cannot multiply matrices).

In terms of syntax, there is a bit of a difference. np.dot expects two 1-dimensional numpy arrays. So you can compute, for example

v = np.array([1,2,3])
w = np.array([2,4,3])
np.dot(v,w)

and this will return 2+8+9 = 19. On the other hand, np.matmul expects two 2-dimensional numpy arrays, and they must have compatible dimensions: if A.shape == (m, k), then B.shape has to equal (k, n) for some n (i.e. the number of columns of A must equal the number of rows of B). So in order to use np.matmul to perform a dot product, you could define v as a row vector and w as a column vector. E.g., to repeat the above example we could write

v = np.array([[1,2,3]])
w = np.array([[2],[4],[3]])
np.matmul(v,w)

and this will give the same value, 19, though np.matmul returns it wrapped in a 1 x 1 array rather than as a scalar.


Thanks so much, this is now very clear.


That is true of the operation that mathematicians call “vector dot product”. But have a look at the documentation for numpy dot: it turns out it is more than just the “vector dot product”. If the operands are 2D arrays, it is actually a full matrix multiply, in the style where the atomic operation is the vector dot product between one row of the first operand and one column of the second operand.

It also turns out that numpy matmul can handle 1D arrays. Watch this:

v = np.array([1, 2, 3])
w = np.array([2, 4, 3])
print(f"np.dot(v,w) = {np.dot(v,w)}")
print(f"np.matmul(v,w) = {np.matmul(v,w)}")

which prints:

np.dot(v,w) = 19
np.matmul(v,w) = 19

They mention in the documentation that they supplement the dimensions in the 1D case to make that work.

So for our purposes here (dealing with 1D and 2D objects) it turns out that np.dot and np.matmul are equivalent. There are differences for higher dimensional objects, which Jenitta mentioned above, but we won’t encounter that case for a while.
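
Here is a quick check of that 2D equivalence (the shapes are arbitrary):

import numpy as np

rng = np.random.default_rng(0)
A = rng.random((3, 4))
B = rng.random((4, 2))

# For 2D operands, np.dot performs a full matrix multiply,
# so it agrees with np.matmul everywhere.
print(np.allclose(np.dot(A, B), np.matmul(A, B)))  # True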

Perhaps it is worth pointing out that there is a completely different multiplication operation: “elementwise” matrix multiplication. That is implemented by the numpy function np.multiply (see its page in the numpy documentation). You can also use the “overloaded” operators if the operands are numpy arrays:

“*” is equivalent to np.multiply
“@” is equivalent to np.matmul

For elementwise multiplication, you also need to be aware of the numpy concept of “broadcasting”, which was discussed on the thread that Ritik linked above.
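
For example, here is a small illustration of the difference between elementwise and matrix multiplication, including a simple broadcast (the values are chosen just for demonstration):

import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[10, 20], [30, 40]])

print(np.multiply(A, B))  # elementwise: [[ 10  40] [ 90 160]]
print(A * B)              # same result via the overloaded * operator
print(A @ B)              # matrix multiply: [[ 70 100] [150 220]]

# Broadcasting: the 1D array is "stretched" across both rows of A.
print(A * np.array([10, 100]))  # [[ 10 200] [ 30 400]]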


Thank you for pointing this out! I didn’t realize this about np.matmul. Very good to know.

I came here looking for this explanation as I saw no difference in the results in code when using both np.dot and np.matmul. Thank you.

I came here with a similar question: does order matter in np.matmul? I noticed that the input order used with matmul differs from that used with np.dot, and I know that order does not matter in np.dot. Since order matters in matrix multiplication, I became curious whether the order of the inputs matters for np.matmul. So I want to understand the consequential difference between the two, such that order matters in one but not the other, yet they give the same result!

The case in which the inputs are 1D vectors is not the general case. Both np.matmul and np.dot are implementations of dot product style matrix multiplication. So they are equivalent and they are not commutative in general. Just because an operation is not commutative in general does not mean that you can’t find special cases in which it happens to be commutative. For example, suppose that A is an arbitrary 5 x 5 matrix and I is the 5 x 5 Identity matrix. Then the following is true:

$A \cdot I = I \cdot A = A$

But that does not imply that

$A \cdot B = B \cdot A$

for all 5 x 5 matrices A and B.
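
Here is a small sketch of that point in numpy (random matrices; the seed is arbitrary):

import numpy as np

rng = np.random.default_rng(42)
A = rng.random((5, 5))
B = rng.random((5, 5))
I = np.eye(5)

# The identity matrix commutes with A ...
print(np.allclose(A @ I, I @ A))  # True

# ... but two arbitrary matrices generally do not.
print(np.allclose(A @ B, B @ A))  # False for almost any random A and B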