What is the difference between pre-training and fine tuning in transfer learning?

Shivam_sharma_MED_07 · November 27, 2022, 3:05pm

I watched the lecture on transfer learning but I am still confused between pre-training and fine-tuning. All I understood was that if we re-initialize only the final layer’s weights and then retrain on the new X and Y then this is called fine-tuning. But if we re-initialize the weights of all the layers and then re-train it on new X and Y then this is called pre-training. Correct me if I am wrong, please.

What if I don’t re-initialize any weights at all and re-train on a new dataset, do we have a term for that in the literature?

Christian_Simonis · November 27, 2022, 3:16pm

Hi there

Just to make sure I understand correctly: what do you mean by “don’t change any weights”? If none of your weights are trainable, i.e. none of them can be changed, you cannot train your net at all.

In general: if you train a new model from scratch, you can just call that “training”. The terms pre-training and fine-tuning are especially relevant for large models with tons of parameters (= weights) that you train with tons of data. In this context, you can:

take a pretrained net (e.g. one for classification of animals) to benefit from the already learned layers and filters specific to a broader problem (e.g. animal classification). Please note that simply being able to use this pretrained model is hugely valuable, given the training time you save.
and use it for fine-tuning, within a reasonable time and with reasonable effort, on a more specific problem (like dog breed detection). Here you can benefit from intermediate features from the mid layers (e.g. edges of animals’ heads or features that characterise paws), while the trainable parameters (weights) let you fine-tune the model to your specific domain (e.g. dogs), for which you need to provide the right labels (e.g. pictures labelled with dog breeds as training data).

Hope that helps!

Best
Christian

rmwkwok · November 27, 2022, 8:41pm

Hello @Shivam_sharma_MED_07,

Welcome to this community! I think you are asking about the case where you just continue from where the model’s training left off. I don’t recall any specific term for this; I would simply say that I “continue training” the model “with a different dataset of my own”.

Raymond

Juan_Olano · November 27, 2022, 9:21pm

Hi @Shivam_sharma_MED_07,

Regarding pre-trained models: you can find and download models from the internet that are ‘ready to use’; these are pre-trained models. Using one of them will save you a lot of time if it was trained for your exact objective. For example, if I want to classify cats in general, I can download a model pre-trained on cats, or even a model pre-trained on domestic animals, cats included.

Now, sometimes my need is more specific and I cannot find a pre-trained model that does it. For example, let’s say I want a model to classify “yellow toy poodles walking on the beach on a rainy day”. Very specific, right? Here I have 2 options: I can either train a model from scratch, or I can download a model pre-trained on cats or domestic animals, like those mentioned above. Let’s say I have about 200 pictures of “yellow toy poodles walking on the beach on a rainy day”. What I can do is take one of the pre-trained models, ‘cut’ the last layer(s) of the model, and add some new layers. I then train only the layers I just added and keep the previous layers ‘fixed’ (Keras offers a per-layer property to freeze training). In this case, I am fine-tuning the model to my very specific needs. Note that the weights of the original layers remain the same, and only the new layers are affected by back-propagation.
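A minimal sketch of the "freeze the old layers, train only the new ones" approach described above, assuming TensorFlow's Keras API. The tiny `base` model is a hypothetical stand-in for a downloaded pre-trained network, and the random arrays are placeholders for the 200 pictures and their labels; neither comes from this thread.

```python
import numpy as np
import tensorflow as tf

# Hypothetical stand-in for a downloaded pre-trained network. In practice this
# could be e.g. tf.keras.applications.MobileNetV2(weights="imagenet",
# include_top=False); a tiny randomly initialised model keeps the sketch offline.
base = tf.keras.Sequential(
    [
        tf.keras.Input(shape=(32, 32, 3)),
        tf.keras.layers.Conv2D(8, 3, activation="relu"),
        tf.keras.layers.GlobalAveragePooling2D(),
    ],
    name="pretrained_base",
)
base.trainable = False  # the Keras switch that freezes every layer in the base

# New layer(s) replacing the 'cut' top of the network (binary task assumed).
inputs = tf.keras.Input(shape=(32, 32, 3))
features = base(inputs, training=False)
outputs = tf.keras.layers.Dense(1, activation="sigmoid", name="new_head")(features)
model = tf.keras.Model(inputs, outputs)
model.compile(optimizer="adam", loss="binary_crossentropy")

# Placeholder data standing in for ~200 labelled pictures.
x = np.random.rand(200, 32, 32, 3).astype("float32")
y = np.random.randint(0, 2, size=(200, 1)).astype("float32")

# Snapshot the weights so we can compare them after training.
base_before = [w.copy() for w in base.get_weights()]
head_before = [w.copy() for w in model.get_layer("new_head").get_weights()]

model.fit(x, y, epochs=1, batch_size=32, verbose=0)

base_after = base.get_weights()
head_after = model.get_layer("new_head").get_weights()
```

Comparing the snapshots shows that only the new head moved: the entries of `base_after` equal those of `base_before`, while `head_after` differs from `head_before`.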

And then we have re-training. In this case, I download one of the models from the internet and run the training with my own training set, allowing all layers to be updated during backprop. Your weights then start from values that have already learned a lot about features, which can be useful given that you may have only a small number of samples for your new objective of “yellow toy poodles walking on the beach on a rainy day”.
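For comparison, a minimal sketch of this "update every layer" variant, again assuming TensorFlow's Keras API. The small `base` model and the random data are hypothetical placeholders, and the small learning rate is a common precaution rather than something stated in this thread: it keeps the starting weights from being overwritten too quickly.

```python
import numpy as np
import tensorflow as tf

# Hypothetical stand-in for a downloaded pre-trained network (in practice its
# weights would come from earlier training, not from random initialisation).
base = tf.keras.Sequential(
    [
        tf.keras.Input(shape=(32, 32, 3)),
        tf.keras.layers.Conv2D(8, 3, activation="relu"),
        tf.keras.layers.GlobalAveragePooling2D(),
    ],
    name="pretrained_base",
)
base.trainable = True  # nothing is frozen: backprop may update every layer

inputs = tf.keras.Input(shape=(32, 32, 3))
outputs = tf.keras.layers.Dense(1, activation="sigmoid", name="head")(base(inputs))
model = tf.keras.Model(inputs, outputs)

# Assumption: a small learning rate, so training nudges the existing weights
# instead of wiping out what they already encode.
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-5),
    loss="binary_crossentropy",
)

# Placeholder data standing in for the new, small training set.
x = np.random.rand(200, 32, 32, 3).astype("float32")
y = np.random.randint(0, 2, size=(200, 1)).astype("float32")

base_before = [w.copy() for w in base.get_weights()]
model.fit(x, y, epochs=1, batch_size=32, verbose=0)
base_after = base.get_weights()

# True if at least one weight tensor of the original layers was updated.
base_changed = any(
    not np.array_equal(a, b) for a, b in zip(base_before, base_after)
)
```

Here `base_changed` reports whether the original layers were updated during training, which is the difference from the frozen setup.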

What do you think?

Juan
