Confused: for transfer learning, where will the input node be?

What confuses me is the idea of cutting off the final layers of a neural network and appending fresh layers in order to train on new data. That is how I understand transfer learning.

I want to know whether the new input node corresponds to the input of the originally pretrained network, or whether the new input starts from the middle (where the old network was truncated).

From the theory I have learned, backpropagation through the newly appended layers will also update the weights of the previously trained layers. I suspect this could work if there were a way to stop the original layers from updating while training only the newly appended ones.

If the above logic is correct, then after transfer learning is complete, the model will predict well on the new data. But if I ask the model to predict on a dataset that was used to train it before transfer learning, that input will have to pass through the new layers to produce an output, and I am afraid these new layers will not perform well on the old data because they were never trained on it.

How does transfer learning circumvent such constraints?

Please pardon me if I am not clear enough. Deep learning is really difficult to explain in writing; pictures and words together do a better job.

Hi, @Chuck!

You are right: vanilla transfer learning means removing the last layer(s) of the pre-trained network and appending new ones. The input node stays exactly where it was in the original network, so new data enters through the same input layer as before; only the output end changes.
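Since the thread does not name a framework, here is a minimal PyTorch sketch, assuming torchvision's ResNet-18 and a hypothetical 5-class target task. It shows that only the final classification head is swapped; the input layer is untouched:

```python
import torch.nn as nn
from torchvision import models

# Load a network pretrained on ImageNet; its input layer is untouched.
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)

# Swap only the final classification head. New images still enter through
# the original input layer and flow through all of the pretrained layers.
num_features = model.fc.in_features    # width at the truncation point
model.fc = nn.Linear(num_features, 5)  # 5 = classes in the (hypothetical) new task
```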

> I suspect this could work if there were a way to stop the original layers from updating while training only the newly appended ones.

That is true as well, and it is indeed how it is supposed to be done: you freeze the pre-trained layers to "warm up" the new ones. After a couple of epochs, you can unfreeze them to fine-tune the whole network.
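Continuing the sketch above, freezing and unfreezing comes down to toggling `requires_grad` on the parameters (the epoch counts are placeholders):

```python
# Phase 1 ("warm-up"): freeze every pretrained parameter so that
# backpropagation updates only the freshly appended head.
for param in model.parameters():
    param.requires_grad = False
for param in model.fc.parameters():  # the newly appended layer
    param.requires_grad = True

# ... train the head for a couple of epochs ...

# Phase 2 (fine-tuning): unfreeze everything and train the whole
# network, typically with a much smaller learning rate.
for param in model.parameters():
    param.requires_grad = True
```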

> I am afraid these new layers will not perform well on the old data because they were never trained on it.

That is precisely the point of transfer learning. The new network will not perform as well on the old data, which is totally fine because that was never the goal: you use transfer learning to achieve good metrics on the new dataset.


Awesome!! Perfectly understood! I had assumed that the new model would still work on the old data.

I would like to ask you, @alvaroramajo: should I be bothered that transfer learning might be overkill in many applications?

Thank you very much.

Yes, it can be overkill in some cases, but it is a great starting point for the optimization process. Since you did not have to pre-train those weights yourself, there is little downside to worry about.
