Do I unfreeze the whole model and run it for like 5 epochs with a very low learning rate? Or do I unfreeze only some of the layers? What is a good practice for transfer learning when you have limited computational power and you want to fine-tune a relatively large model?
Hey @Marios_Constantinou,
If you have relatively limited computational power, unfreezing the entire model and training it for 5 epochs might not be feasible, since the model has 23+ million parameters. On the other hand, unfreezing only a few of the final layers and then training is more practical, and it is also the approach most commonly used under the umbrella of fine-tuning.
However, I guess the major deciding factor for this choice is the size of the dataset. If you have a comparatively large dataset, you may choose to unfreeze all the layers and train them for a few epochs (more computation required), but if you only have a small dataset of a few hundred or a few thousand examples, then unfreezing only a few of the final layers and training them is the way to go (less computation required). A minimal sketch of that second option follows below. Let me know if this helps.
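For concreteness, here is a rough Keras sketch of the "unfreeze only the final block" idea; the input shape, head size, and learning rate are placeholders you would adapt to your own task:

import tensorflow as tf

# Pretrained ResNet50 backbone without its classification head
base_model = tf.keras.applications.ResNet50(
    include_top=False, weights='imagenet', input_shape=(224, 224, 3))

# Unfreeze only the layers belonging to the final conv5 block
base_model.trainable = True
for layer in base_model.layers:
    layer.trainable = layer.name.startswith('conv5_')

# Attach a small classification head (num_classes is a placeholder)
num_classes = 10
inputs = tf.keras.Input(shape=(224, 224, 3))
x = base_model(inputs, training=False)  # keep BatchNorm layers in inference mode
x = tf.keras.layers.GlobalAveragePooling2D()(x)
outputs = tf.keras.layers.Dense(num_classes, activation='softmax')(x)
model = tf.keras.Model(inputs, outputs)

# A low learning rate is the usual choice when fine-tuning pretrained weights
model.compile(optimizer=tf.keras.optimizers.Adam(1e-5),
              loss='categorical_crossentropy',
              metrics=['accuracy'])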
My dataset consists of ~30k images, I don’t think that’s considered large by any means. The thing is, I am training on my CPU, I don’t have an Nvidia GPU, so each epoch with frozen layers takes around 9 minutes. If I unfreeze the whole model, it takes around 36 minutes, so for 5 epochs that’s 3 extra hours. I don’t know if I just got used to training on a CPU, but 3 hours is okay for me, since I can do something else during the day. (I did check Google Colab but I got mixed results in terms of computational performance.)
I guess it will get worse when I try out bigger models, because I will train a few and compare them. So I might unfreeze the entire model here, and for the bigger models only unfreeze some of the final layers.
Indeed, your computation time will grow when you unfreeze the entire model (especially for larger models), and the approach you describe is one way to go. Another option is to unfreeze only some of the final layers in each of the models in your comparison study. That way, in my opinion, you would get a much fairer comparison across all your models, and then you can fully unfreeze and train only the model that comes out best after the first round; see the sketch below. I hope this helps.
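A rough sketch of that two-round idea (the helper function, the candidate list, and the cut-off of 10 layers are all hypothetical choices):

import tensorflow as tf

def set_trainable_from(model, first_trainable_index):
    # Freeze every layer before the given index, unfreeze the rest
    for idx, layer in enumerate(model.layers):
        layer.trainable = idx >= first_trainable_index

# Round 1: give every candidate the same small number of trainable layers
candidates = [
    tf.keras.applications.ResNet50(include_top=False, weights='imagenet'),
    tf.keras.applications.ResNet101(include_top=False, weights='imagenet'),
]
for candidate in candidates:
    set_trainable_from(candidate, len(candidate.layers) - 10)

# ... train and compare the candidates here ...

# Round 2: fully unfreeze only the model that came out best and retrain it
# best_model.trainable = True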
One last question that just came to me. Since I will unfreeze some layers, I need to pick the correct layer to unfreeze from, right? In the case of ResNet50, we have conv blocks, so I can’t just put in a random number to unfreeze from; I have to find where one block ends and the next one starts?
So from base_model.summary() I found where the conv4 blocks end and where conv5 starts, but since the model has about 190 layers I had to find that layer’s index. This is my code:
# Find the index of the layer you want to fine-tune from
index = None
for idx, layer in enumerate(base_model.layers):
    if layer.name == 'conv4_block6_out':
        index = idx
        break
print(index)
Output: 153
# Make sure it's the correct index
print(base_model.layers[153])
Output: <keras.layers.merging.add.Add object at 0x0000012550261C90>
Hey @Marios_Constantinou,
The code looks correct to me and does exactly what you want it to do. I hope the results also turn out as you expect.
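As a small follow-up, once you have that index, one possible next step is to freeze everything up to and including conv4_block6_out and fine-tune the rest. This assumes your full model is built on top of base_model, and the optimizer and learning rate here are just placeholders:

# Freeze conv1 through conv4_block6_out, leave conv5 trainable
base_model.trainable = True
for layer in base_model.layers[:index + 1]:
    layer.trainable = False

# Recompile with a low learning rate before continuing training
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-5),
              loss='categorical_crossentropy',
              metrics=['accuracy'])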