What is meant by "When freezing layers avoid keeping track of statistics (like in the batch normalization layer)"?

Hi, Daniele.

The key point to realize here is that a Batch Norm layer carries two kinds of state: trainable parameters (the learned scale and shift) and running statistics (the moving mean and variance of the activations), and these are controlled separately from the usual “parameters” of the other layers. So what they are saying is to make sure to do two things:

  1. Freeze the parameters of the layers (the weight and bias values), so that gradient updates don’t change them.
  2. Stop the batch normalization layers from updating their running statistics as well (those are the “statistics” they are referring to: the moving mean and variance that BN accumulates from each minibatch during training). During fine-tuning you want BN to run in inference mode, using its stored statistics rather than recomputing and updating them from your new data. See the sketch after this list.
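Here is a minimal sketch of how both steps look in Keras, following the pattern from the TensorFlow transfer learning guide. The choice of MobileNetV2 and the input shape are just placeholders for whatever base model you are using:

```python
import tensorflow as tf

# Load a pretrained base model (MobileNetV2 is just an example choice).
base_model = tf.keras.applications.MobileNetV2(
    input_shape=(160, 160, 3),
    include_top=False,
    weights="imagenet",
)

# Step 1: freeze the parameters. No gradient updates will be applied to
# the weights/biases (or to BN's learned scale and shift).
base_model.trainable = False

inputs = tf.keras.Input(shape=(160, 160, 3))
# Step 2: call the base model with training=False so the BatchNorm layers
# run in inference mode. They use their stored moving mean/variance
# instead of recomputing (and updating) statistics from each minibatch.
x = base_model(inputs, training=False)
x = tf.keras.layers.GlobalAveragePooling2D()(x)
outputs = tf.keras.layers.Dense(1)(x)
model = tf.keras.Model(inputs, outputs)
```

Note that `training=False` here is the argument passed when the layer is *called*, which is a different mechanism from the `trainable` attribute set in step 1.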

As the sketch shows, the mechanisms you use to control those two types of training are different: freezing is done with the `trainable` attribute, while the statistics updates are controlled by the `training` argument at call time. All this is not very thoroughly explained in the course materials, but the Keras documentation is pretty complete if you can find the relevant sections. Here’s a thread that has pointers into the Keras documentation about Transfer Learning.