How does axis=-1 make sense in tf.keras.layers.Normalization?

snandi · November 30, 2022, 6:13pm

norm_l = tf.keras.layers.Normalization(axis=-1)

norm_l.adapt(X)  # learns mean, variance

Xn = norm_l(X)

In the code above, X (the training set) has shape (200,2) because it has 200 training examples and 2 features. In the Coffee Roasting Lab, tf.keras.layers.Normalization is called with axis=-1, which does not make sense to me. I believe it should be axis=0, since we want to calculate the mean and variance over all the training examples for each feature. Can someone please explain how axis=-1 works out?

I tried axis=0, but I get this error:

All `axis` values to be kept must have known shape. Got axis: (0,), input shape: [None, 2], with unknown axis at index: 0

TMosh · November 30, 2022, 6:27pm

When using Keras, “axis = -1” is shorthand for “use the last axis”. So it automatically adjusts for the shape of the data.
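As a quick check of the "last axis" reading, here is a minimal sketch, assuming TensorFlow 2.x and NumPy are installed. The data is made up (random numbers shaped like the lab's 200 x 2 training set), and the names norm_last and norm_one are only for illustration. For a 2-D input the last axis is axis 1, so both layers should produce the same output.

```python
import numpy as np
import tensorflow as tf

# Made-up stand-in for the lab's training set: 200 examples, 2 features.
rng = np.random.default_rng(0)
X = rng.normal(loc=[200.0, 13.0], scale=[30.0, 1.0], size=(200, 2)).astype("float32")

norm_last = tf.keras.layers.Normalization(axis=-1)  # "use the last axis"
norm_one = tf.keras.layers.Normalization(axis=1)    # the same axis, spelled out

norm_last.adapt(X)
norm_one.adapt(X)

out_last = np.asarray(norm_last(X))
out_one = np.asarray(norm_one(X))

# For a 2-D input, axis=-1 and axis=1 name the same axis.
same = np.allclose(out_last, out_one)
print(same)
```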

rmwkwok · December 1, 2022, 2:35am

Hello @snandi,

Welcome to the community!

I always suggest reading the documentation when something is unclear. Let me quote:

The axis or axes that should have a separate mean and variance for each index in the shape. For example, if shape is (None, 5) and axis=1, the layer will track 5 separate mean and variance values for the last axis… Defaults to -1, where the last axis of the input is assumed to be a feature dimension and is normalized per index.
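To make that wording concrete: axis names the axis that is kept (one mean and variance per index along it), so the statistics are computed by reducing over the other axes. The following is a minimal sketch, assuming TensorFlow 2.x and NumPy, with made-up data in place of the lab's X. It compares the layer's output with a manual per-feature normalization that reduces over axis 0, which is the calculation the original question had in mind.

```python
import numpy as np
import tensorflow as tf

# Made-up stand-in for the lab's training set: 200 examples, 2 features.
rng = np.random.default_rng(0)
X = rng.normal(loc=[200.0, 13.0], scale=[30.0, 1.0], size=(200, 2)).astype("float32")

# axis=-1: keep the last axis, i.e. one mean/variance per feature.
norm_l = tf.keras.layers.Normalization(axis=-1)
norm_l.adapt(X)
Xn = np.asarray(norm_l(X))

# Manual version: reduce over axis 0 (the examples) to get per-feature statistics.
mu = X.mean(axis=0)   # shape (2,)
var = X.var(axis=0)   # population variance, shape (2,)
Xn_manual = (X - mu) / np.sqrt(var)

# The two should agree up to float32 rounding.
matches = np.allclose(Xn, Xn_manual, atol=1e-3)
print(matches)
```

So axis=-1 in the layer and "mean over axis 0" in NumPy describe the same computation from opposite sides: one names the axis to keep, the other names the axis to reduce.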

Cheers,
Raymond

snandi · December 1, 2022, 5:47pm

Thanks, Raymond
It makes sense now.

Can you please explain why we are getting the shape [None, 2] in the error message?

rmwkwok · December 1, 2022, 8:24pm

You are welcome @snandi.

TensorFlow reserves the 0th axis for the number of samples, and it deliberately does not fix how many there are. Although your training set has 200 samples, TensorFlow lets the number of samples you actually feed in be anything, which keeps the model flexible for predicting on any number of test cases and for any mini-batch size during training. The None there means "any number" in this context.

As for the 2, it is the shape of a single sample, (2,), which is the only thing you need to tell TensorFlow for the input_shape of the model.
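Here is a minimal sketch of the [None, 2] shape, assuming TensorFlow 2.x and NumPy, with made-up data and illustrative variable names. Declaring only the per-sample shape (2,) gives a symbolic input whose batch axis is None, and an adapted Normalization layer then accepts batches of any size. This is also why axis=0 is a problem: it would ask the layer to keep a separate mean and variance for each index along an axis whose size is not known.

```python
import numpy as np
import tensorflow as tf

# Only the shape of a single sample is declared; the batch axis is left open.
inputs = tf.keras.Input(shape=(2,))
symbolic_shape = tuple(inputs.shape)
print(symbolic_shape)  # (None, 2)

# Made-up stand-in for the lab's training set: 200 examples, 2 features.
rng = np.random.default_rng(0)
X = rng.normal(loc=[200.0, 13.0], scale=[30.0, 1.0], size=(200, 2)).astype("float32")

norm_l = tf.keras.layers.Normalization(axis=-1)
norm_l.adapt(X)

# The same layer handles any number of samples along axis 0.
one_sample = norm_l(X[:1])
five_samples = norm_l(X[:5])
shapes = (tuple(one_sample.shape), tuple(five_samples.shape))
print(shapes)  # ((1, 2), (5, 2))
```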

Cheers,
Raymond
