I read the post by AshutoshSahu and the answers given to it, but I am still confused. I followed the answer given by raymond in that post but didn’t get the last part.

Suppose we have a NumPy array of shape (2,2,3). If we set axis=-1, is it taking the mean and variance of [1,2,3], [2,3,4], [5,4,3], [3,2,1] separately and then calculating the final values?

This array has 3 axes: 0, 1, and 2. So axis=-1 means the operation is done over axis 2. If we sum over axis 2 (or axis=-1), the result is a 2-dimensional array in which each value is the sum of the values along axis 2.

For instance:

ary.sum(axis=-1) = ary.sum(axis=2) =

[[ 6  9]
 [12  6]]

See how we arrived at shape=(2,2) from shape=(2,2,3), because we ‘consolidated’ everything on axis=-1 (the axis with index 2, counting from 0).

The same principle would apply to other operations like mean.
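To make this concrete, here is a minimal NumPy sketch using the four triples from the question. The way they are arranged into a (2,2,3) array is an assumption based on the order in which they were listed.

```python
import numpy as np

# Values from the question, arranged (by assumption) into shape (2, 2, 3)
ary = np.array([[[1, 2, 3], [2, 3, 4]],
                [[5, 4, 3], [3, 2, 1]]])

# Reducing over the last axis collapses each innermost triple to one number
total = ary.sum(axis=-1)   # identical to ary.sum(axis=2)
mean = ary.mean(axis=-1)   # mean of each triple
var = ary.var(axis=-1)     # population variance of each triple

print(ary.shape)    # (2, 2, 3)
print(total.shape)  # (2, 2)
print(total)
print(mean)
print(var)
```

So with plain NumPy reductions, each triple is handled separately and axis 2 disappears from the result.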

Tell me if I am right.
The normalization process basically calculates the mean and variance for each column, because each column represents a feature. That is why we pass axis=-1: we want a mean and a variance value for each feature.

Normalization is scaling the input variables so that they have similar ranges of values. We don’t want, for instance, some variables with values under 100 and other variables with values over 100,000. With normalization we, well, normalize these inputs to prevent one variable from dominating the others.

One way to achieve normalization is by subtracting the mean of each variable and dividing by the standard deviation.
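As a rough illustration of that formula, the sketch below normalizes a small made-up dataset with plain NumPy. The data values and the names `X`, `mu`, `sigma`, and `X_norm` are hypothetical. Note that to get one mean and one standard deviation per feature (column), the reduction here runs over the sample axis (axis=0), which leaves the last axis (the features) in place. A library normalization layer may interpret its `axis` argument differently from a NumPy reduction, so check its documentation.

```python
import numpy as np

# Hypothetical data: 4 samples (rows), 2 features (columns) on very different scales
X = np.array([[50.0, 100000.0],
              [60.0, 150000.0],
              [70.0, 200000.0],
              [80.0, 250000.0]])

mu = X.mean(axis=0)     # one mean per feature (column)
sigma = X.std(axis=0)   # one standard deviation per feature

# Broadcasting applies each column's mean and std to every row
X_norm = (X - mu) / sigma

print(mu)                    # per-feature means
print(X_norm.mean(axis=0))   # approximately 0 for each feature
print(X_norm.std(axis=0))    # approximately 1 for each feature
```

After this step both columns live on a comparable scale, so neither dominates simply because of its units.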