In the MobileNet exercise we implemented the alpaca model in two steps:
Step 1. Use the pretrained MobileNet with its layers frozen and alter the final two layers.
Step 2. Take the model trained in Step 1, unfreeze the layers from 120 onwards, and fine-tune their parameters.
The number of epochs was set to around 4-5 in each step.
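For reference, here is a minimal sketch of the two-step setup I mean (Keras API; the image size, learning rates, and the `train_ds`/`val_ds` datasets are placeholders from memory, not the exact notebook code):

```python
import tensorflow as tf

IMG_SIZE = (160, 160)  # placeholder image size, not necessarily the exercise's

# Step 1: pretrained MobileNetV2 base, frozen, with a new classification head
base_model = tf.keras.applications.MobileNetV2(
    input_shape=IMG_SIZE + (3,), include_top=False, weights="imagenet")
base_model.trainable = False

inputs = tf.keras.Input(shape=IMG_SIZE + (3,))
x = base_model(inputs, training=False)   # keep BatchNorm layers in inference mode
x = tf.keras.layers.GlobalAveragePooling2D()(x)
x = tf.keras.layers.Dropout(0.2)(x)
outputs = tf.keras.layers.Dense(1)(x)    # binary alpaca / not-alpaca logit
model = tf.keras.Model(inputs, outputs)

model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),
              loss=tf.keras.losses.BinaryCrossentropy(from_logits=True),
              metrics=["accuracy"])
# history = model.fit(train_ds, validation_data=val_ds, epochs=5)

# Step 2: unfreeze layers from 120 onwards, recompile with a lower
# learning rate, and continue training
base_model.trainable = True
for layer in base_model.layers[:120]:
    layer.trainable = False

model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),
              loss=tf.keras.losses.BinaryCrossentropy(from_logits=True),
              metrics=["accuracy"])
# history_fine = model.fit(train_ds, validation_data=val_ds,
#                          initial_epoch=5, epochs=10)
```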
The epoch split is probably a hyperparameter to tune, and the answer is likely "it depends".
But is there any guidance, research, or rule of thumb for a starting point on how to split the total number of epochs between the "frozen" stage and the "fine-tuning" stage?