In the week 2 Transfer Learning assignment I understand the logic behind freezing the earlier layers and unfreezing the last “X” layers so that the network can pick up on your domain’s intricacies.
What I don’t really understand is the second part of this statement from the Transfer Learning notebook: “Set training in base_model to False to avoid keeping track of statistics in the batch norm layer”.
My understanding is that setting training in ‘base_model’ to False avoids changing the pretrained weights and only trains the new layers, which makes sense. It’s the last part, about “keeping track of statistics in the batch norm layer”, where I am lost.
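For context, here is roughly the pattern I’m asking about (a sketch from memory, so the exact input shape and head layers may differ from the notebook):

```python
import tensorflow as tf

# Pretrained base with its original classification head removed
base_model = tf.keras.applications.MobileNetV2(input_shape=(160, 160, 3),
                                               include_top=False,
                                               weights='imagenet')
base_model.trainable = False  # freeze the pretrained weights

inputs = tf.keras.Input(shape=(160, 160, 3))
# This is the call the quoted comment refers to
x = base_model(inputs, training=False)
x = tf.keras.layers.GlobalAveragePooling2D()(x)
outputs = tf.keras.layers.Dense(1)(x)  # new trainable head
model = tf.keras.Model(inputs, outputs)
```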
Any help in understanding this would be greatly appreciated!