The EfficientNetB4 model begins with a Rescaling layer followed by a Normalization layer. From the official documentation of `keras.layers.Normalization`, I inferred that this layer shifts and scales inputs into a distribution centered around 0 with standard deviation 1. It accomplishes this by “precomputing” the mean and variance of the data, and calling `(input - mean) / sqrt(var)` at runtime.
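As a rough numpy sketch of what the layer computes at call time (the statistic values below are made up for illustration, not EfficientNet's actual ones):

```python
import numpy as np

# Hypothetical precomputed statistics for a single channel (assumed values)
mean = np.array([0.485])
var = np.array([0.052])

def normalize(x, mean, var):
    # The operation keras.layers.Normalization applies at runtime:
    # (input - mean) / sqrt(var)
    return (x - mean) / np.sqrt(var)

x = np.array([0.485, 0.713])
y = normalize(x, mean, var)
# An input equal to the mean maps to 0; others are scaled by 1/sqrt(var)
```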

I wonder how it “precomputes” the mean and standard deviation of the input dataset. If I use `keras.preprocessing.image.ImageDataGenerator`, then how and when does this precomputation happen?

`ImageDataGenerator` is an iterator over the underlying images. Pixel values are still floating-point numbers, so the mean and variance can be computed at the batch level as training proceeds.

Okay. Suppose the training set consists of, say, 6400 images and we use a batch size of 64. While training, for each batch, 64 images are read and their mean and standard deviation are computed. The images of that batch are then normalized using this mean and standard deviation, and training proceeds. Am I right?

If this is so, then wouldn’t the mean and std be different for each batch? Also, what values of mean and std will we use to normalize the images in the test set?
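A quick numpy check of that concern, on synthetic data (the numbers here are stand-ins, not actual image statistics):

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(loc=0.4, scale=0.1, size=(6400, 1))  # stand-in for 6400 images
batches = data.reshape(100, 64, 1)                     # batch size 64

batch_means = batches.mean(axis=(1, 2))
# Each batch's mean fluctuates around the global mean instead of matching it,
# so per-batch normalization would indeed use slightly different statistics
# for every batch.
```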

Please read this, where it’s mentioned that when using `keras.layers.Normalization` you either call `adapt` or manually supply the mean and variance before invoking the `fit` method.
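A plain-numpy analogue of what `adapt` does may make this concrete (the class below is my own illustration, not the Keras implementation): one pass over the training data fixes the statistics, which then stay frozen.

```python
import numpy as np

class SimpleNormalization:
    """Numpy stand-in for keras.layers.Normalization (illustration only)."""

    def adapt(self, data):
        # One pass over the *training* data; statistics are frozen afterwards
        self.mean = data.mean(axis=0)
        self.var = data.var(axis=0)

    def __call__(self, x):
        return (x - self.mean) / np.sqrt(self.var)

train = np.array([[0.2], [0.4], [0.6], [0.8]])
norm = SimpleNormalization()
norm.adapt(train)      # precomputation happens here, before any training step
out = norm(train)      # training-set output is centered at 0 with unit variance
```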

There’s one more layer, called BatchNormalization, which tracks a moving mean and variance of its inputs on the fly. All you need to do is place this layer within the Keras model and let it update those statistics during model training.

Mean and variance should be learnt only from the training dataset. Nothing new should be learnt from the test dataset since it is ONLY meant to evaluate model performance.
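Concretely, the statistics frozen from the training set are reused unchanged on the test set (synthetic data again, just to show the pattern):

```python
import numpy as np

rng = np.random.default_rng(2)
train = rng.normal(0.5, 0.1, size=(6400, 1))
test = rng.normal(0.5, 0.1, size=(1000, 1))

# Statistics come from the training set ONLY...
mean, var = train.mean(), train.var()

# ...and those same values normalize both splits; nothing is
# recomputed from the test set.
train_n = (train - mean) / np.sqrt(var)
test_n = (test - mean) / np.sqrt(var)
```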