I am trying to implement the normalization using numpy and tensorflow normalization (first seen this layer in C2_W1_Lab03_CoffeeRoasting_Numpy assignment)

```
tf.random.set_seed(0)
x = tf.random.uniform((32, 5))
n = tf.keras.layers.Normalization(axis=0)
n.adapt(x)
n.mean.shape # this gives TensorShape([32, 1])
tf.reduce_mean(x, axis=0).shape # this gives TensorShape([5])
```

Hello @tbhaxor,

Great work trying it out yourself, and listing the shapes! `tf.keras.layers.Normalization` is implemented to work that way, and as the documentation says:

> Integer, tuple of integers, or None. The axis or axes that should have a separate mean and variance for each index in the shape. For example, if shape is `(None, 5)` and `axis=1`, the layer will track 5 separate mean and variance values for the last axis.

Setting `axis=0` in `Normalization` tells the layer to keep a separate mean and variance for each index along the zeroth axis, so those statistics are computed by averaging over the remaining (first) axis, which is why `n.mean` has shape `(32, 1)`.

Setting `axis=0` in `reduce_mean`, by contrast, computes the means by reducing (averaging) over the zeroth axis itself, leaving shape `(5,)`.
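The two conventions can be compared side by side in plain NumPy (a minimal sketch of just the mean computation; the actual layer also tracks variance): the layer's `axis=0` means "one statistic per index along axis 0", i.e. average over axis 1, while `reduce_mean`'s `axis=0` means "collapse axis 0".

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(size=(32, 5))

# Like Normalization(axis=0): one mean per index along axis 0,
# so the reduction happens over the remaining axis (axis 1).
mean_like_layer = x.mean(axis=1, keepdims=True)
print(mean_like_layer.shape)  # (32, 1)

# Like tf.reduce_mean(x, axis=0): axis 0 itself is reduced away.
mean_reduced = x.mean(axis=0)
print(mean_reduced.shape)  # (5,)
```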

Both conventions sound reasonable; as for why each API made the choice it did, we would really need to ask the developers…

Cheers,

Raymond