Do I unfreeze the whole model and run it for like 5 epochs with a very low learning rate? Or do I unfreeze only some of the layers? What is a good practice for transfer learning when you have limited computational power and you want to fine-tune a relatively large model?
Hey @Marios_Constantinou,
If you have relatively limited computational power, unfreezing the entire model and training it for 5 epochs might not be feasible, since the model has 23+ million parameters. On the other hand, unfreezing only a few of the final layers and then training is more practical, and it is also the approach most commonly used under the umbrella of fine-tuning.
However, I guess the major deciding factor for this choice is the size of the dataset. If you have a comparatively large dataset, you may choose to unfreeze all the layers and train them for a few epochs (more computation required), but if you only have a small dataset of a few hundred or a few thousand examples, then unfreezing only a few of the final layers and training them is the way to go (less computation required). A minimal sketch of that second option follows below. Let me know if this helps.
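For concreteness, here is a rough Keras sketch of the "unfreeze only the final block" idea; the input shape, head size, and learning rate are placeholders you would adapt to your own task:

import tensorflow as tf

# Pretrained ResNet50 backbone without its classification head
base_model = tf.keras.applications.ResNet50(
    include_top=False, weights='imagenet', input_shape=(224, 224, 3))

# Unfreeze only the layers belonging to the final conv5 block
base_model.trainable = True
for layer in base_model.layers:
    layer.trainable = layer.name.startswith('conv5_')

# Attach a small classification head (num_classes is a placeholder)
num_classes = 10
inputs = tf.keras.Input(shape=(224, 224, 3))
x = base_model(inputs, training=False)  # keep BatchNorm layers in inference mode
x = tf.keras.layers.GlobalAveragePooling2D()(x)
outputs = tf.keras.layers.Dense(num_classes, activation='softmax')(x)
model = tf.keras.Model(inputs, outputs)

# A low learning rate is the usual choice when fine-tuning pretrained weights
model.compile(optimizer=tf.keras.optimizers.Adam(1e-5),
              loss='categorical_crossentropy',
              metrics=['accuracy'])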
My dataset consists of ~30k images, I don’t think that’s considered large by any means. The thing is, I am training on my CPU, I don’t have an Nvidia GPU, so each epoch with frozen layers takes around 9 minutes. If I unfreeze the whole model, it takes around 36 minutes, so for 5 epochs that’s 3 extra hours. I don’t know if I just got used to training on a CPU, but 3 hours is okay for me, since I can do something else during the day. (I did check Google Colab but I got mixed results in terms of computational performance.)
I guess it will get worse when I try out bigger models, because I will train a few and compare them. So I might unfreeze the entire model here, and for the bigger models only unfreeze some of the final layers.
Indeed, your computation time will grow when you unfreeze the entire model (especially for larger models), and the approach you describe is one way to go. Another option is to unfreeze only some of the final layers in each of the models in your comparison study. That way, in my opinion, you would get a much fairer comparison across all your models, and then you can fully unfreeze and train only the model that comes out best after the first round; see the sketch below. I hope this helps.
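A rough sketch of that two-round idea (the helper function, the candidate list, and the cut-off of 10 layers are all hypothetical choices):

import tensorflow as tf

def set_trainable_from(model, first_trainable_index):
    # Freeze every layer before the given index, unfreeze the rest
    for idx, layer in enumerate(model.layers):
        layer.trainable = idx >= first_trainable_index

# Round 1: give every candidate the same small number of trainable layers
candidates = [
    tf.keras.applications.ResNet50(include_top=False, weights='imagenet'),
    tf.keras.applications.ResNet101(include_top=False, weights='imagenet'),
]
for candidate in candidates:
    set_trainable_from(candidate, len(candidate.layers) - 10)

# ... train and compare the candidates here ...

# Round 2: fully unfreeze only the model that came out best and retrain it
# best_model.trainable = True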
One last question that just came to me. Since I will unfreeze some layers, I need to pick the correct layer to unfreeze from, right? In the case of ResNet50, we have conv blocks, so I can’t just put in a random number to unfreeze from; I have to find where one block ends and the next one starts?
So from base_model.summary() I found where the conv4 blocks end and where conv5 starts, but since the model has about 190 layers I had to find that layer’s index. This is my code:
# Find the index of the layer you want to fine-tune from
index = None
for idx, layer in enumerate(base_model.layers):
    if layer.name == 'conv4_block6_out':
        index = idx
        break
print(index)
Output: 153
# Make sure it's the correct index
print(base_model.layers[153])
Output: <keras.layers.merging.add.Add object at 0x0000012550261C90>
Hey @Marios_Constantinou,
The code looks correct to me and does exactly what you want it to do. I hope the results also turn out as you expect.
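As a small follow-up, once you have that index, one possible next step is to freeze everything up to and including conv4_block6_out and fine-tune the rest. This assumes your full model is built on top of base_model, and the optimizer and learning rate here are just placeholders:

# Freeze conv1 through conv4_block6_out, leave conv5 trainable
base_model.trainable = True
for layer in base_model.layers[:index + 1]:
    layer.trainable = False

# Recompile with a low learning rate before continuing training
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-5),
              loss='categorical_crossentropy',
              metrics=['accuracy'])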