`axis=-1` refers to the last axis. If we have an array of shape `(2, 3, 4)`, `axis=-1` means the operation is performed along the last axis, which has 4 elements.
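As a small sketch of that shape example (the array `x` and its values are made up for illustration), reducing over the last axis removes it and leaves a result of shape `(2, 3)`:

```python
import numpy as np

# Hypothetical example array with shape (2, 3, 4); values are 0..23.
x = np.arange(24).reshape(2, 3, 4)

# Reducing along the last axis collapses the 4-element axis.
last = np.argmax(x, axis=-1)
print(last.shape)  # (2, 3)

# For a 3-D array, axis=-1 and axis=2 name the same axis.
same = np.array_equal(last, np.argmax(x, axis=2))
print(same)  # True
```

Writing `-1` instead of `2` keeps the call correct even if the number of leading dimensions changes.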

When an array has shape `(1, 10)`, leaving out `axis=-1` makes little difference apart from the shape of the result: with no `axis`, `np.argmax` searches the flattened array and returns a single integer, while `axis=-1` returns a one-element array holding the same index.

```
>>> import numpy as np
>>> a = np.random.rand(1, 10)
>>> np.argmax(a)
4
>>> np.argmax(a, axis=-1)
array([4])
```

In deep learning, we make predictions on batches of data. So, if you have 32 rows with 10 features in each row, `axis=-1` makes a real difference: without it you get a single index into the flattened array, and with it you get one index per row:

```
>>> b = np.random.rand(32, 10)
>>> np.argmax(b)
192
>>> np.argmax(b, axis=-1)
array([4, 7, 9, 3, 3, 5, 2, 1, 6, 4, 2, 4, 9, 4, 1, 3, 3, 9, 9, 2, 7, 0,
       8, 0, 4, 6, 5, 7, 3, 5, 4, 8])
```
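To see how the two results relate, the flat index returned without `axis` can be converted back to a (row, column) pair with `np.unravel_index`. This is a minimal sketch; the seeded generator and the variable names are assumptions chosen for repeatability, not part of the example above:

```python
import numpy as np

rng = np.random.default_rng(0)  # seed chosen arbitrarily for repeatability
b = rng.random((32, 10))        # stand-in for 32 rows of 10 scores

flat = np.argmax(b)              # index into the flattened array
per_row = np.argmax(b, axis=-1)  # one index per row, shape (32,)

# Convert the flat index back to (row, column) coordinates.
row, col = np.unravel_index(flat, b.shape)

# The global maximum lies in `row`, so that row's argmax is `col`.
assert per_row[row] == col
print(per_row.shape)  # (32,)
```

In the output above, `192` corresponds to row 19, column 2, and the entry at position 19 of the `axis=-1` result is indeed `2`.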