Questions about transfer learning in "Transfer_learning_with_MobileNet_v1"

Hi guys,

I have already completed the Transfer_learning_with_MobileNet_v1 homework, but I still have some questions. Thanks in advance for your time.

My understanding: to use transfer learning, I download a pre-made network (say 10 layers total) and modify only the last 2 layers (the fully connected layers) for my new target. This also means I train only those last 2 modified layers; the original first 8 layers are “frozen”. Assuming my understanding is correct, my questions are below:

Q1. In the Exercise 2 - alpaca_model section, we have:

def alpaca_model(image_shape=IMG_SIZE, data_augmentation=data_augmenter()):
    
    input_shape = image_shape + (3,)
    
    ### START CODE HERE
    
    base_model = tf.keras.applications.MobileNetV2(input_shape=,
                                                   include_top=, # <== Important!!!!
                                                   weights='') # From imageNet
    
    # freeze the base model by making it non trainable
    base_model.trainable = False
    # create the input layer (Same as the imageNetv2 input size)
    inputs = Input...
    # apply data augmentation to the inputs
    x = augmentation...
    # data preprocessing using the same weights the model was trained on
    x = preprocess_input...
    # set training to False to avoid keeping track of statistics in the batch norm layer
    x = base_model...
    # add the new Binary classification layers
    # use global avg pooling to summarize the info in each channel
    x = GlobalAveragePooling2D...
    # include dropout with probability of 0.2 to avoid overfitting
    x = Dropout...
    # use a prediction layer with one neuron (as a binary classifier only needs one)
    outputs = Dense...
    
    ### END CODE HERE
    model = tf.keras.Model(inputs, outputs)
    return model

Then we use this model for training:

initial_epochs = 5
history = model2.fit(train_dataset, validation_data=validation_dataset, epochs=initial_epochs)

My question is: we already have base_model.trainable = False, so what are we training here? My understanding is that the whole network is NOT trainable, not even the newly added layers. Is this right or wrong?

After this section, we start fine-tuning. My understanding here is that I need to unfreeze only the newly added fully connected layers and train only them (the layers before those newly added layers are still frozen).
I want to double-check my understanding of the code below:

base_model = model2.layers[4]
base_model.trainable = True

Q2. What is the meaning of trainable = True for layers[4]?

# Freeze all the layers before the `fine_tune_at` layer
for layer in base_model.layers[:fine_tune_at]:
    layer.trainable = True

Q3. Does this freeze or unfreeze layers? The loop variable is layer, it runs from the 1st layer up to fine_tune_at, and trainable = True. My understanding of this code is that it un-freezes the layers from the 1st layer up to the fine_tune_at layer. But I need to unfreeze the layers after fine_tune_at, right?

I am aware that I need to improve my Python skills :frowning: Thanks again for your time.

Note, I have deleted my previous reply.

Hi @sunson29 ,

I just replied to another learner about a similar question. I’d like to share that answer in hopes that it will shed light on your question:

Juan

Hi Juan,

Thanks for the link. You explained it very well!!! I saved it as a note :slight_smile: I think my understanding was correct.

However, if I may, could you explain more on the code side of my questions? I wish to synchronize my understanding 100% with the code. For example, in Q1, if trainable = False, what are we training later with model.fit? In Q2, which layers end up frozen after all?

Juan, could you check my questions again? Thank you for your time!

Hello @sunson29,

Let’s examine your quoted code while discussing your question:

This is the answer to your first question. When you defined alpaca_model, you added this very Dense layer, which is trainable by default. Although the base_model is not trainable, the Dense layer IS trainable. Therefore:

The Dense layer.

Wrong.

You don’t need to unfreeze the Dense layer because it is trainable by default.
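If it helps, here is a minimal sketch (assuming model2 is the model returned by alpaca_model, as in the notebook) of how you can see for yourself which weights model.fit will actually update:

# Only the new Dense head should show up as trainable
print(len(model2.trainable_variables))      # weights that fit() will update
print(len(model2.non_trainable_variables))  # frozen MobileNetV2 weights

for v in model2.trainable_variables:
    print(v.name, v.shape)                  # expect only the Dense layer's kernel and bias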

This is how you can set a layer to be trainable. However, you need to make sure that 4 is the right layer index you want to modify. Also, after you get the correct layer index, before assigning True to its trainable attribute, I suggest you print its value first to see what it is set to at that point.
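For example, a small sketch (assuming model2 was built by alpaca_model as in the notebook, and that index 4 really is the base model):

base_model = model2.layers[4]
print(base_model.name)       # the name should correspond to the MobileNetV2 base model
print(base_model.trainable)  # inspect the flag before changing it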

I think setting the right value for layer.trainable is your assignment ;). What is the correct value if your objective is to “Freeze all the layers before the fine_tune_at layer”?

This is not about understanding the code; it is about understanding the requirement of the exercise. I strongly suggest you read the Exercise 3 description again, right before the code cell of Exercise 3. It will answer your question!

Cheers,
Raymond

Hi @rmwkwok

Sorry for the late reply. I am very busy with my work, especially at the end of the year.

For Q1, thanks for the detailed reply. Yes, I understand now.
For Q2, no, I am still stuck. Allow me to demonstrate my question here:

Step 1: unfreeze the base model

base_model = model2.layers[4]
base_model.trainable = True

We pick layer 4, which is the originally loaded network, and unfreeze it. This means all the layers inside base_model (i.e., model2.layers[4]) are trainable. I checked, and yes, I do see that all the layers in layers[4] have trainable=True after the code above.
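For reference, this is roughly how I checked it (a quick sketch):

print(all(layer.trainable for layer in base_model.layers))  # True: every layer in the base model is unfrozen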

Step 2: however, the next step is to set a layer to fine-tune from and then re-freeze all the layers before it. I think the code is not doing what it describes.

# Fine-tune from this layer onwards
fine_tune_at = 120
# Freeze all the layers before the `fine_tune_at` layer
for layer in base_model.layers[:fine_tune_at]:
    layer.trainable = True

I ran the code above and checked the layers’ trainable attributes in base_model. All I see is that they are ALL trainable. I think the True here should be False, so that layers [0 to 119] end up with trainable == False (re-freeze all the layers before it).
In other words, if I run the code below, I see that layers [0 to 119] are frozen (trainable = False) and layers 120 to the end have trainable = True:

for layer in base_model.layers[:fine_tune_at]:
    layer.trainable = False

@rmwkwok could you enlighten me again, step by step, the way you did before? Thank you

I agree. Please note that the assignment code is initially like this:

for layer in base_model.layers[:fine_tune_at]:
    layer.trainable = None  # <-- it is None initially

so it is your exercise task to replace None with the correct value, and I agree with your argument about what None should be replaced with.
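For completeness, a small sketch of the corrected loop plus a quick check (with fine_tune_at = 120, as in your code):

fine_tune_at = 120

# Freeze all the layers before the `fine_tune_at` layer
for layer in base_model.layers[:fine_tune_at]:
    layer.trainable = False

# Quick check: count how many layers end up frozen vs. trainable
n_frozen = sum(not layer.trainable for layer in base_model.layers)
n_trainable = sum(layer.trainable for layer in base_model.layers)
print(n_frozen, n_trainable)  # expect 120 frozen, the rest trainable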

I think you have pretty much explained everything yourself :wink:

Raymond


Hey @rmwkwok, omg… I think I tricked myself after all. I must have accidentally changed layer.trainable = None to layer.trainable = True, and then believed that was the given initial code… Thank you for your help!

@rmwkwok one more question please, a very simple one.

base_model = tf.keras.applications.MobileNetV2(input_shape=IMG_SHAPE,
                                               include_top=True,
                                               weights='imagenet')

In the code above, if include_top is False, the last 2 layers (in this MobileNetV2 case) will NOT be included, and I understand that. My question is: what code can I use to check whether a layer belongs to the top or not? I am expecting to see something like:

base_model.layers[layerNumber].thisistop = True or False

Because if the pre-made network changes, I need to know how many of the last layers are considered the “top”. I know I could just compare the include_top=True and include_top=False versions, but I was hoping there is a simpler piece of code to do that. Thank you

No worries @sunson29 :wink:

Unfortunately, I don’t think there is such an attribute to find out whether a layer is part of the “top” or not. Indeed, comparing the True and False versions, as you said, is one way. What I did before replying to you was:

  1. Go to the doc
  2. Click “View source on Github”
  3. Find the part that actually reacts to include_top

There is no simpler way :wink:
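That said, if you want a programmatic check, here is a sketch (not an official attribute, just the comparison approach): build both variants and diff their layer names; the layers that exist only in the include_top=True version are the “top”.

import tensorflow as tf

# weights=None is enough here, since we only care about the layer names
IMG_SHAPE = (224, 224, 3)  # MobileNetV2's default input size
with_top = tf.keras.applications.MobileNetV2(input_shape=IMG_SHAPE, include_top=True, weights=None)
no_top = tf.keras.applications.MobileNetV2(input_shape=IMG_SHAPE, include_top=False, weights=None)

no_top_names = {layer.name for layer in no_top.layers}
top_layers = [
    layer.name
    for layer in with_top.layers
    if layer.name not in no_top_names
    and not isinstance(layer, tf.keras.layers.InputLayer)  # input layers get auto-generated names
]
print(top_layers)  # e.g. the global average pooling layer and the 'predictions' Dense layer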

Cheers,
Raymond


Got it. Thanks again!

You are welcome @sunson29 !
