Pre-trained model - What is the parameter 'weights=None'?

Hi,

I am not able to understand why weights=None is used when loading the pre-trained model as the base model.
In an example from the documentation, it is specified as below:

base_model = tf.keras.applications.MobileNetV2(input_shape=IMG_SHAPE, include_top=False, weights='imagenet')

Whereas in the exercise notebook 'Apply transfer learning to Cats vs Dogs', it is specified as weights=None:
pre_trained_model = InceptionV3(input_shape=(150, 150, 3),
                                include_top=False,
                                weights=None)

Seeking more clarity on this difference.
Thanks


When you specify weights=None, the weights learned from ImageNet are not used for initializing the model, i.e. only the architecture is used.

It’s possible to load weights using the load_weights method after the model has been created (see the sketch after this list).
This could be done for a number of reasons, like:

  1. When using the Coursera platform to run code, they might not want to download the weights for every student submission. Having a shared location for the weights saves bandwidth.
  2. It’s possible that the author of this notebook took the base weights, tuned them for a few epochs, and then saved them. When you make use of these new weights, there’s less training left to do.
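For illustration, here is a minimal sketch of that pattern, using the local weights path that the Cats vs Dogs notebook uses:

from tensorflow.keras.applications.inception_v3 import InceptionV3

# Build the InceptionV3 architecture only; weights=None means nothing is downloaded.
pre_trained_model = InceptionV3(input_shape=(150, 150, 3),
                                include_top=False,
                                weights=None)

# Load weights from a file already on disk (the shared file used in the course notebook).
local_weights_file = '/tmp/inception_v3_weights_tf_dim_ordering_tf_kernels_notop.h5'
pre_trained_model.load_weights(local_weights_file)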

Hello,

You can use one of 3 options (see the sketch after this list):
1 - The weights are initialized randomly if weights = None
2 - Pre-trained ImageNet weights are used if weights = 'imagenet'
3 - Or the weights are loaded from the path to a weights file
The default is 'imagenet'.
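A minimal sketch of the three options (the .h5 path in option 3 is a made-up placeholder):

from tensorflow.keras.applications.inception_v3 import InceptionV3

shape = (150, 150, 3)

# 1 - Random initialization: architecture only, no pre-trained weights.
model_a = InceptionV3(input_shape=shape, include_top=False, weights=None)

# 2 - ImageNet weights, downloaded automatically (the default).
model_b = InceptionV3(input_shape=shape, include_top=False, weights='imagenet')

# 3 - Weights loaded from a local file (placeholder path; adjust to yours).
model_c = InceptionV3(input_shape=shape, include_top=False,
                      weights='/tmp/my_inception_weights.h5')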
Why layer.trainable = False?
In a NN, parameters that don’t compute gradients are usually called frozen parameters. It is useful to “freeze” part of your model if you know in advance that you won’t need the gradients of those parameters (this offers some performance benefits by reducing gradient computations). This is known as finetuning: we freeze most of the model and typically only modify the classifier layers to make predictions on new labels.
You can see in the notebook that only the parameters of the layers after last_layer are updated; every layer of the base model is set to layer.trainable = False.
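A minimal sketch of that freezing pattern (the layer name 'mixed7' is the one the course notebook picks; adjust it if your copy differs):

from tensorflow.keras.applications.inception_v3 import InceptionV3

pre_trained_model = InceptionV3(input_shape=(150, 150, 3),
                                include_top=False, weights=None)

# Freeze every layer of the base so its weights receive no gradient updates.
for layer in pre_trained_model.layers:
    layer.trainable = False

# Use an intermediate layer's output as the feature extractor.
last_layer = pre_trained_model.get_layer('mixed7')
last_output = last_layer.output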


Thanks for the replies, @balaji.ambresh and @bisht.
Fine, now it’s clear why weights=None: we are using the architecture without the weights (no transfer learning), and when we specify weights='imagenet', we are using the weights trained on ImageNet data.

Now, in the exercise notebook, we have used
local_weights_file = '/tmp/inception_v3_weights_tf_dim_ordering_tf_kernels_notop.h5'
So these could be weights from a custom-trained model, provided for learning on the Coursera platform, as @balaji.ambresh suggested.

I have more queries in this regard.

In the exercise 'Apply transfer learning to Cats vs Dogs', we are freezing the layers of the pre-trained model and finetuning it.

We have specified include_top=False, which means we are not including the fully connected dense layers at the top of the pretrained model.

Now, in the exercise notebook we have fine-tuned this pretrained model on our data by adding a Flatten layer, a fully connected Dense layer, and an output layer, roughly as below.
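Continuing from the frozen base sketched above, I mean something like this (the 1024-unit size is illustrative, not prescriptive):

from tensorflow.keras import layers, Model
from tensorflow.keras.optimizers import RMSprop

# Head from the exercise: Flatten, a fully connected Dense layer, and a
# sigmoid output for the binary cats-vs-dogs decision.
x = layers.Flatten()(last_output)
x = layers.Dense(1024, activation='relu')(x)  # illustrative size
x = layers.Dense(1, activation='sigmoid')(x)

model = Model(pre_trained_model.input, x)
model.compile(optimizer=RMSprop(learning_rate=0.0001),
              loss='binary_crossentropy',
              metrics=['accuracy'])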

But in some other examples, as also shown in the documentation, the finetuning was done by adding a Global Average Pooling layer and an output layer. So why is a fully connected Dense layer not added there? What is the difference between these 2 approaches?


When we transfer a model, we have the following options:

  1. Pick weights.
  2. Pick the subset of layers.
  3. Tune how many of those layers to train, after deciding on the subset of layers and the initial weights.
  4. Extend the base model.

Choice of model architecture is problem specific and requires a lot of experimentation (guided by metrics on train / validation datasets). There is no one correct answer for all problems.
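To make the pooling-based head concrete, here is a sketch assuming the same frozen InceptionV3 base as above. GlobalAveragePooling2D collapses each feature map to a single number, so this head has far fewer parameters than Flatten + Dense:

from tensorflow.keras import layers, Model

# Documentation-style head: average-pool the feature maps, then classify.
# Fewer parameters than Flatten + Dense, which can reduce overfitting.
y = layers.GlobalAveragePooling2D()(pre_trained_model.output)
y = layers.Dense(1, activation='sigmoid')(y)
model_pooled = Model(pre_trained_model.input, y)

Which head performs better is, again, something to validate against your own train / validation metrics.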

Thanks @balaji.ambresh