Week 2 Assignment 2 Why do we set base_model(training=False)?

ken2022 · December 19, 2022, 9:37pm

In the second assignment of Week 2, we implemented transfer learning with MobileNet. We import the MobileNet model with the following code:

base_model = tf.keras.applications.MobileNetV2(input_shape=input_shape,
                                               include_top=False,  # <== Important!!!
                                               weights='imagenet')  # From imageNet

Then we set

base_model.trainable = False

So we freeze all parameters in the base_model.

Why do we need to set base_model(x, training=False) in the code that follows? The comment says to “set training to False to avoid keeping track of statistics in the batch norm layer”. I don’t understand this explanation. I set training=True and it also gives a decent result.

My guess is that setting training=False makes base_model act as an inference model. But we have already made base_model untrainable. Why do we need this extra step?

AbdElRhaman_Fakhry · December 19, 2022, 10:51pm

Hi @ken2022

I think the key is that MobileNet contains batch normalization layers (as in the picture above). Unlike a dense layer, a batch normalization layer keeps running statistics (the mean and variance it uses for standardization), and these depend on the data passed through it. Writing base_model(x, training=False) prevents those statistics from changing, so the layer keeps the statistics it computed during its original training. This is different from base_model.trainable = False, which freezes the layers' weights.
I think this link will give you a good intuition about it: python - What does training = False actually do for Tensorflow Transfer Learning? - Stack Overflow
Thanks!
Abdelrahman

Juan_Olano · December 19, 2022, 11:02pm

On a more general note related to Transfer Learning:

In transfer learning, we start with a pre-trained model, and then we may want to fine-tune the model on our own dataset. This can be done by unfreezing some of the layers of the pre-trained model or adding new layers to the pre-trained model, and training those layers while keeping the rest of the layers frozen.

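Setting training=False when calling a pre-trained model is typically done during the fine-tuning process. It runs the model in inference mode, so that layers such as batch normalization use their stored statistics instead of updating them; the weights of the frozen layers, which have already been trained, are kept fixed separately by setting trainable = False.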

By keeping the weights of the frozen layers fixed, we allow the model to learn from the new dataset using only the unfrozen layers. This allows us to take advantage of the knowledge and features learned by the pre-trained model and apply them to our new dataset.

ken2022 · December 20, 2022, 2:59am

Thank you for your reply. So when we set base_model.trainable = False, we only freeze the trainable parameters. BatchNormalization will still update its moving averages unless we also pass training=False. Is my understanding right?

AbdElRhaman_Fakhry · December 20, 2022, 5:55am

Yes, your understanding is right, because the batch normalization layer depends on the data it sees, as I said before.

Thanks!
Abdelrahman
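
The distinction discussed in this thread can be sketched with a toy batch normalization layer in plain Python. This is only an illustration under simplifying assumptions, not Keras's implementation: the class ToyBatchNorm, its apply_gradients method, and the momentum value are all hypothetical, and the two flags are treated as fully independent here. Keras's own BatchNormalization layer has additional rules about how trainable interacts with inference mode, so its documentation is the place to check the exact behaviour.

```python
# Toy batch normalization over a batch of scalars (hypothetical sketch,
# not the Keras implementation).
class ToyBatchNorm:
    def __init__(self, momentum=0.9, eps=1e-5):
        self.gamma = 1.0        # trainable scale (a "weight")
        self.beta = 0.0         # trainable shift (a "weight")
        self.moving_mean = 0.0  # non-trainable running statistic
        self.moving_var = 1.0   # non-trainable running statistic
        self.momentum = momentum
        self.eps = eps
        self.trainable = True   # controls gradient updates only

    def __call__(self, batch, training=False):
        if training:
            # Training mode: normalize with the current batch statistics
            # and fold them into the running statistics.
            mean = sum(batch) / len(batch)
            var = sum((x - mean) ** 2 for x in batch) / len(batch)
            m = self.momentum
            self.moving_mean = m * self.moving_mean + (1 - m) * mean
            self.moving_var = m * self.moving_var + (1 - m) * var
        else:
            # Inference mode: reuse the stored statistics, change nothing.
            mean, var = self.moving_mean, self.moving_var
        scale = (var + self.eps) ** 0.5
        return [self.gamma * (x - mean) / scale + self.beta for x in batch]

    def apply_gradients(self, d_gamma, d_beta, lr=0.1):
        # Freezing (trainable = False) only blocks this weight update.
        if not self.trainable:
            return
        self.gamma -= lr * d_gamma
        self.beta -= lr * d_beta


bn = ToyBatchNorm()
bn.trainable = False            # freeze gamma and beta
batch = [10.0, 12.0, 14.0]

bn(batch, training=False)       # inference call
stats_after_inference = (bn.moving_mean, bn.moving_var)

bn(batch, training=True)        # training-mode call on the frozen layer
stats_after_training = (bn.moving_mean, bn.moving_var)

bn.apply_gradients(d_gamma=1.0, d_beta=1.0)  # ignored: layer is frozen

print("after inference call:", stats_after_inference)
print("after training call: ", stats_after_training)
print("gamma, beta:", bn.gamma, bn.beta)
```

In this toy, freezing protects gamma and beta, but a call with training=True still moves the running mean and variance, which is the behaviour the assignment's comment about "keeping track of statistics" is warning about.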
