Hello

In this slide, the code `np.matmul(AT, w)` was used for matrix multiplication.

It seems that `np.dot(AT, w)` does the same thing. Is it essentially the same thing?

Thanks

Eric


Hello @Eric5

`matmul` differs from `dot` in two ways:

- Multiplication by scalars is not allowed.
- Stacks of matrices are broadcast together as if the matrices were elements, which means **the two behave differently in 3D or higher dimensions**.
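Both differences can be seen in a short sketch (a minimal illustration, assuming NumPy is imported as `np`; the array values are arbitrary):

```python
import numpy as np

A = np.arange(6).reshape(2, 3)

# Difference 1: np.dot accepts a scalar operand, np.matmul does not.
print(np.dot(2, A))            # scalar * matrix works with np.dot
try:
    np.matmul(2, A)
except Exception as e:
    print("matmul rejects scalars:", type(e).__name__)

# Difference 2: for stacks of matrices (3D+), matmul broadcasts over the
# leading axes and multiplies matrix-by-matrix; np.dot instead sums over
# the last axis of the first operand and the second-to-last of the second,
# pairing every stack with every stack.
S = np.arange(12).reshape(2, 2, 3)   # a stack of two 2x3 matrices
T = np.arange(12).reshape(2, 3, 2)   # a stack of two 3x2 matrices
print(np.matmul(S, T).shape)   # (2, 2, 2): two 2x2 products
print(np.dot(S, T).shape)      # (2, 2, 2, 2)
```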


Hi Eric,

The other solutions are correct as well, but I just wanted to go into a little more detail in case it's helpful.

You are correct mathematically, but only in a restricted setting: the dot product is *essentially* the same as matrix multiplication, but only when you are multiplying two vectors (the dot product cannot multiply matrices).

In terms of syntax, there is a bit of a difference. `np.dot` expects two 1-dimensional numpy arrays. So you can compute, for example

```
v = np.array([1,2,3])
w = np.array([2,4,3])
np.dot(v,w)
```

and this will return `2+8+9 = 19`. On the other hand, `np.matmul` expects two 2-dimensional numpy arrays, and they must be of compatible dimensions: if `A.shape == (m, k)`, then `B.shape` has to be equal to `(k, n)` for some `n` (i.e. the number of columns of `A` is the number of rows of `B`). So in order to use `np.matmul` to perform a dot product, you could define `v` as a row vector and `w` as a column vector. E.g. to repeat the above example we could write

```
v = np.array([[1,2,3]])
w = np.array([[2],[4],[3]])
np.matmul(v,w)
```

and this will give the same value as the previous example, although wrapped as a 1×1 matrix, `[[19]]`, rather than the scalar `19`.


Thanks so much, this is now very clear


That is true of the operation that mathematicians call “vector dot product”. But have a look at the documentation for numpy dot: it turns out it is more than just “vector dot product”. If the operands are 2D arrays, it is actually a full matrix multiply in the style where the atomic operation is the vector dot product between one row of the first operand and one column of the second operand.
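A quick numeric check of that 2D behavior (a minimal sketch, with arbitrary example matrices):

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])

# For 2D operands, np.dot is full matrix multiplication:
# entry (i, j) is the dot product of row i of A with column j of B.
print(np.dot(A, B))                                   # [[19 22] [43 50]]
print(np.array_equal(np.dot(A, B), np.matmul(A, B)))  # True for 2D inputs
```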

It also turns out that numpy `matmul` can handle 1D arrays. Watch this:

```
v = np.array([1, 2, 3])
w = np.array([2, 4, 3])
print(f"np.dot(v,w) = {np.dot(v,w)}")
print(f"np.matmul(v,w) = {np.matmul(v,w)}")
```

which prints:

```
np.dot(v,w) = 19
np.matmul(v,w) = 19
```

They mention in the documentation that they supplement the dimensions in the 1D case to make that work.

So for our purposes here (dealing with 1D and 2D objects) it turns out that `np.dot` and `np.matmul` are equivalent. There are differences for higher dimensional objects, which Jenitta mentioned above, but we won't encounter that case for a while.

Perhaps it is worth pointing out that there is a completely different multiplication operation: “elementwise” matrix multiplication. That is implemented by the numpy function `np.multiply`, documented here. You can also use the “overloaded” operators if the operands are numpy arrays:

`*` is equivalent to `np.multiply`
`@` is equivalent to `np.matmul`

For elementwise multiplication, you also need to be aware of the numpy concept of “broadcasting”, which was discussed on the thread that Ritik linked above.
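Here is a small sketch of the two multiplications and their overloaded operators side by side, including one broadcasting case (example values are arbitrary):

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[10, 20], [30, 40]])

# Elementwise product: each entry multiplied independently.
print(A * B)                                       # [[10 40] [90 160]]
print(np.array_equal(A * B, np.multiply(A, B)))    # True

# Matrix product: rows of A dotted with columns of B.
print(A @ B)
print(np.array_equal(A @ B, np.matmul(A, B)))      # True

# Broadcasting in elementwise ops: a (2, 2) array times a (2,) vector.
row = np.array([10, 100])
print(A * row)   # row is stretched across both rows of A: [[10 200] [30 400]]
```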


Thank you for pointing this out! I didn't realize this about `np.matmul`. Very good to know.

I came here looking for this explanation, as I saw no difference in the results in code when using both `np.dot` and `np.matmul`. Thank you.

I came here with a similar question, and I want to ask whether order matters in `np.matmul`, because I see the input order in `matmul` differs from that of `np.dot` – but I know that order does not matter in `np.dot`. I became curious whether the order of the inputs matters for `np.matmul`, since order matters in matrix multiplication. So I want to hear what the consequential difference between the two is, such that order matters in one and not the other, yet they provide the same result!

The case in which the inputs are 1D vectors is not the general case. Both `np.matmul` and `np.dot` are implementations of dot-product-style matrix multiplication. So they are equivalent, and they are *not* commutative in general. Just because an operation is not commutative in general does not mean that you can't find special cases in which it happens to be commutative. For example, suppose that A is an arbitrary 5 x 5 matrix and I is the 5 x 5 Identity matrix. Then the following is true:

A · I = I · A = A

But that does **not** imply that

A · B = B · A

for all 5 x 5 matrices A and B.
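The point above can be checked numerically; here is a minimal sketch using 2 x 2 matrices for brevity (the example matrices are arbitrary, chosen so the products are easy to verify by hand):

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[0, 1], [1, 0]])   # the "swap" matrix
I = np.eye(2, dtype=int)

# The identity matrix commutes with every matrix:
print(np.array_equal(A @ I, I @ A))   # True: both equal A

# But matmul is not commutative in general:
print(A @ B)   # swaps the *columns* of A -> [[2 1] [4 3]]
print(B @ A)   # swaps the *rows* of A    -> [[3 4] [1 2]]
```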