In the week 2 Transfer Learning assignment I understand the logic behind freezing the earlier layers and unfreezing the last “X” layers so that the network can pick up on your domain’s intricacies.

What I’m not really understanding is the second part of this statement from the Transfer Learning notebook: “Set training in base_model to False to avoid keeping track of statistics in the batch norm layer”.

Having training in ‘base_model’ set to False avoids changing the weights and only trains the new layers which makes sense. It’s the last part around “keeping track of statistics in the batch norm layer” where I am lost.

Any help in understanding this would be greatly appreciated!


The training argument to BatchNorm is a different thing from setting the overall model to be non-trainable. Take a look at the docs for Keras BatchNormalization. We also saw that argument being used more explicitly in the previous assignment about Residual Networks.
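To make the two switches concrete, here is a minimal sketch of the usual Keras transfer-learning setup (the base model and input shape are just placeholder assumptions, not the exact ones from the assignment):

```python
import tensorflow as tf

# Assumed base model; weights=None here just to keep the sketch self-contained.
base_model = tf.keras.applications.MobileNetV2(
    input_shape=(160, 160, 3), include_top=False, weights=None)

# Switch 1: freeze the weights so the optimizer never updates them.
base_model.trainable = False

inputs = tf.keras.Input(shape=(160, 160, 3))
# Switch 2: training=False runs the base model in inference mode, so each
# BatchNorm layer normalizes with its stored moving mean/variance instead
# of the current batch's statistics, and does not update those averages.
x = base_model(inputs, training=False)
x = tf.keras.layers.GlobalAveragePooling2D()(x)
outputs = tf.keras.layers.Dense(1)(x)
model = tf.keras.Model(inputs, outputs)
```

Note that `trainable = False` alone does not flip the second switch: a BatchNorm layer called with `training=True` would still use (and track) batch statistics even while its weights are frozen, which is why the notebook sets both.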

Okay, so the statistics that are being referenced are the mean and variance used to normalize the current batch. Thank you for that link, that was exactly what I was looking for!
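For anyone else reading: a minimal NumPy sketch (simplified, no learned gamma/beta parameters) of which statistics BatchNorm uses in each mode:

```python
import numpy as np

def batchnorm(x, moving_mean, moving_var, training, momentum=0.99, eps=1e-3):
    """Simplified batch norm showing which statistics each mode uses."""
    if training:
        # Training mode: normalize with the CURRENT batch's mean/variance...
        mean, var = x.mean(axis=0), x.var(axis=0)
        # ...and update the running averages as a side effect
        # ("keeping track of statistics").
        moving_mean = momentum * moving_mean + (1 - momentum) * mean
        moving_var = momentum * moving_var + (1 - momentum) * var
    else:
        # Inference mode (training=False): use the stored moving averages;
        # nothing is updated, which is what you want for a frozen base model.
        mean, var = moving_mean, moving_var
    return (x - mean) / np.sqrt(var + eps), moving_mean, moving_var
```

With `training=True` the moving averages drift toward the new data's statistics; with `training=False` they stay exactly as learned during pre-training.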