What does -1 mean in Normalization?

khaldon · March 30, 2024, 7:55pm

This code creates an instance of Normalization in TensorFlow. What does the axis parameter -1 mean? The TensorFlow documentation says it refers to the feature dimension, so what does "dimension" mean here? Shouldn't it be 1, to indicate that we normalize the columns?

norm_l = tf.keras.layers.Normalization(axis=-1)

Nevermnd · March 30, 2024, 8:02pm

@khaldon See if this post helps you:

tarunsaxena1000 · April 4, 2024, 12:35pm

Hi, 10 days ago I was stuck on this same problem for 2 days, so let me share what I found.

In NumPy and TensorFlow, the axis parameter is used to specify along which axis an operation should be performed. When you specify axis=-1, it indicates that the operation should be applied along the last axis of the array.

In a 2D array, the last axis (axis=-1) is the column axis, i.e. axis=1. This convention makes sense when considering the typical use cases for operations like normalization and summing:

For normalization in TensorFlow, it's common to normalize the features (columns) of a dataset. Each column represents a different feature, and you typically want to normalize each feature independently (i.e. vertically, down each column).
For summing with np.sum, axis=-1 sums across each row. This is often needed when, for example, you want row-wise sums over matrices of data points or observations (i.e. horizontally).

Therefore, the behavior of axis=-1 being interpreted as operating along the columns for normalization and along the rows for summing aligns with the typical use cases and conventions in data analysis and machine learning.
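As a small illustration of the summing case, here is a minimal NumPy sketch. The array `x` and the names `row_sums` and `same_as_axis_1` are made up for this example; it only shows that axis=-1 and axis=1 pick the same axis in a 2D array.

```python
import numpy as np

# Hypothetical data: 2 rows (examples) and 3 columns (features)
x = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])

# axis=-1 is the last axis; for a 2D array that is axis=1 (the columns)
row_sums = np.sum(x, axis=-1)
same_as_axis_1 = np.sum(x, axis=1)

print(row_sums)                                   # [ 6. 15.]
print(np.array_equal(row_sums, same_as_axis_1))   # True
```

The column axis is collapsed, so one number is left per row.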

While this convention may seem counterintuitive at first glance, it's consistent with the way arrays are indexed and used in numerical computing libraries like NumPy and TensorFlow. Once you become familiar with this convention, it becomes easier to understand and work with array operations in these libraries. The cause may be that TensorFlow and NumPy were initially developed independently.

I am not fully convinced by this answer; I have just made peace with it. If you find a better explanation, please let me know.

tarunsaxena1000 · April 4, 2024, 12:40pm

@TMosh @rmwkwok @Deepti_Prasad can you verify my answer above, please?

TMosh · April 4, 2024, 4:22pm

I cannot say, as I have never investigated this in detail.

rmwkwok · April 5, 2024, 12:30am

Hello @tarunsaxena1000,

I agree with most of your comments, but for the sake of discussion, let me share my version:

Consider a 2D array x that has the row axis and the column axis.

For most array operations, we specify which axis to do away with. For np.sum(x, axis=-1), we do away with the last axis by summing over it, leaving only the row axis and thus calling it a row-wise operation.

Normalizations are exceptions: we specify the axis for which we want to create and keep the normalization constants. For tf.keras.layers.Normalization(axis=-1)(x), we keep constants for the column axis (one mean and one variance per column) and thus call it a column-wise operation.

When thinking about what kind of operation it is:

For sum, we do away with the columns and keep the rows. We call it a row-wise operation.
For normalization, we keep the constants of the columns. We call it a column-wise operation.

When thinking about what to specify for axis:

For most array operations, it is about what to do away with.
For normalizations (and BatchNormalization), it is about what to keep.
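To make the "what to keep" idea concrete, here is a rough NumPy sketch of the statistics that a normalization over the last axis would keep. It is an emulation written for this thread, not the Keras layer itself, and the array `x` and the variable names are made up. The statistics are taken over every axis except the kept one, which for a 2D array means over axis 0.

```python
import numpy as np

# Hypothetical data: 3 examples (rows) and 2 features (columns)
x = np.array([[1.0, 10.0],
              [3.0, 30.0],
              [5.0, 50.0]])

# Keep the last (column) axis: reduce over the other axis (the rows),
# which leaves one mean and one variance per column
mean = x.mean(axis=0)
var = x.var(axis=0)

# Each column is shifted and scaled by its own constants
x_norm = (x - mean) / np.sqrt(var)

print(mean.shape)            # (2,)
print(x_norm.mean(axis=0))   # approximately 0 for each column
print(x_norm.std(axis=0))    # approximately 1 for each column
```

Note that the constants have the shape of the kept axis, while np.sum with axis=-1 returns a result with the shape of the axis that was not named.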

The above comment applies to “my convention” too. For example, with a 10-D x, for np.sum(x, axis=(3, 4, 5)), we will do away with those 3 axes and leave only axes 0, 1, 2, 6, 7, 8, 9 before they are re-numbered in order.
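A quick shape check of the 10-D case, as a sketch. The axis sizes below are arbitrary values chosen for this example so that the removed axes are easy to spot.

```python
import numpy as np

# Hypothetical 10-D array; axes 3, 4, 5 have sizes 5, 6, 7
x = np.ones((2, 3, 4, 5, 6, 7, 2, 3, 2, 2))

# Sum over axes 3, 4 and 5: those three axes disappear from the result
reduced = np.sum(x, axis=(3, 4, 5))

print(x.ndim)          # 10
print(reduced.shape)   # (2, 3, 4, 2, 3, 2, 2)
```

The remaining seven axes are the original axes 0, 1, 2, 6, 7, 8, 9, now numbered 0 to 6.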

Nice explanation, @tarunsaxena1000, and it is always good to read more different views

Cheers,
Raymond

rmwkwok · April 5, 2024, 1:12am

Btw, @tarunsaxena1000, my “do away with” description makes even more sense if you check the NumPy documentation for sum or other similar operations, because you will see a parameter called keepdims (default False). If you don't keep the dims, the reduced axes are completely done away with; otherwise, you still get to keep them with size 1 (but not the original values, because they have been aggregated).
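Here is a minimal sketch of that parameter in action. In NumPy it is spelled `keepdims`; the array `x` and the result names are made up for this example.

```python
import numpy as np

# Hypothetical data: 2 rows, 3 columns
x = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])

# Default (keepdims=False): the summed axis is removed entirely
dropped = np.sum(x, axis=-1)

# keepdims=True: the summed axis stays, but with size 1
kept = np.sum(x, axis=-1, keepdims=True)

print(dropped.shape)   # (2,)
print(kept.shape)      # (2, 1)
```

The values are the same in both cases; only the shape differs.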

tarunsaxena1000 · April 11, 2024, 3:58am

@rmwkwok Thanks for this detailed explanation.

rmwkwok · April 12, 2024, 12:07am

You are welcome, @tarunsaxena1000!
