Transfer Learning of U-Net

Hi,

In the previous week's assignment we did transfer learning on MobileNetV2. I was wondering how to do similar transfer learning for U-Net. I assume that for the YOLO algorithm it would be like what we did for MobileNetV2: we could re-train the layer that outputs the predictions, or maybe the last few layers of the network, in order to do transfer learning.

For U-Net, though, the high-level representations are at the bottom of the U shape. If I want to re-use a U-Net as a pre-trained model but work on my own type of segmentation targets (for example an indoor scene or a scene from a video game), does that mean I should retrain only the bottleneck layers and re-use the pre-trained weights of the encoder and the decoder?
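In code, I imagine it would look roughly like this (just a sketch in Keras; the file name and the "bottleneck" layer-name prefix are placeholders I made up):

```python
import tensorflow as tf

# Hypothetical sketch: freeze the encoder and decoder of a pre-trained U-Net
# and retrain only the bottleneck at the bottom of the "U".
unet = tf.keras.models.load_model("pretrained_unet.h5")

for layer in unet.layers:
    # Only the bottleneck layers stay trainable
    layer.trainable = layer.name.startswith("bottleneck")

unet.compile(optimizer=tf.keras.optimizers.Adam(1e-4),
             loss="sparse_categorical_crossentropy")
# unet.fit(my_indoor_scene_dataset, epochs=...)
```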

Thanks,
Wei

It’s an interesting question. I don’t know the answer and don’t have any actual practical experience with transfer learning, but I think the principle, as Prof Ng has explained it to us, is that you keep the beginning part of the network the same, from the input through some point in the architecture, and then change the later layers and do incremental training at least on the changed or added layers at the end.

I would be a little worried about the idea of changing the input layers but leaving the later bottleneck layers as is. Maybe you would get some advantage from retraining all the layers, but starting from the pre-trained weights instead of starting completely from scratch with random weights.
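If you wanted to try that, a minimal sketch in Keras might look something like this (the model file and dataset names are just placeholders):

```python
import tensorflow as tf

# Hypothetical sketch: keep the whole pre-trained U-Net trainable and
# fine-tune every layer with a small learning rate, so the pre-trained
# weights are nudged rather than wiped out in the first few updates.
unet = tf.keras.models.load_model("pretrained_unet.h5")
unet.trainable = True  # nothing frozen

unet.compile(optimizer=tf.keras.optimizers.Adam(1e-5),
             loss="sparse_categorical_crossentropy",
             metrics=["accuracy"])
# unet.fit(new_segmentation_dataset, epochs=...)
```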

You can try some Google searching and see if anyone discusses how to do transfer learning with U-Net. Or if you try any experiments, please let us know what you find out!

Hi Wei,

In addition to what Paul has mentioned, you can refer to a similar question asked previously here.

Thanks for the replies!

In the other post’s reply by jonaslalin, they mentioned that which layers to fine-tune is just another hyperparameter, which seems to align with the IEEE paper linked in that reply. The authors found that fine-tuning the contracting layers while freezing the expanding layers helped in their case, which they suspect is due to low-level feature differences between the input image sets. I definitely plan to look at their code and play around with it.
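If I understand their strategy correctly, in Keras it would look roughly like this (the "decoder" layer-name prefix is made up for illustration):

```python
import tensorflow as tf

# Rough sketch of the strategy from that paper, as I understand it:
# fine-tune the contracting (encoder) path and the bottleneck,
# but freeze the expanding (decoder) path.
unet = tf.keras.models.load_model("pretrained_unet.h5")

for layer in unet.layers:
    layer.trainable = not layer.name.startswith("decoder")

unet.compile(optimizer=tf.keras.optimizers.Adam(1e-4),
             loss="sparse_categorical_crossentropy")
```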

I have another, perhaps odd, question: if I start from pre-trained weights without freezing any layers and eventually get a satisfactory new model, are there metrics for comparing the new model with the old one? For example, could I compare the weights layer by layer, conclude which layers differ most from the original model, and then say that if there is not much computational power available, those are the layers that should be fine-tuned?

Thanks,
Wei

That’s a really interesting idea: analyze the network after the incremental training to see whether there is any disparity in how the different layers were affected. One approach would be to take the norm of the difference between the weights at each layer and look for a pattern in those differences. E.g., try something like:

||W_{before}^{[l]} - W_{after}^{[l]}||
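In code, that per-layer comparison might look something like this rough sketch (it assumes the two models have identical architectures and layer order):

```python
import numpy as np

def layer_weight_deltas(model_before, model_after):
    """Sketch: norm of the weight change in each layer, assuming both
    models have exactly the same architecture and layer order."""
    deltas = {}
    for before, after in zip(model_before.layers, model_after.layers):
        w_before = before.get_weights()
        w_after = after.get_weights()
        if not w_before:  # skip layers without weights (pooling, etc.)
            continue
        deltas[before.name] = sum(
            float(np.linalg.norm(b - a)) for b, a in zip(w_before, w_after))
    return deltas

# e.g. sorted(layer_weight_deltas(old_unet, new_unet).items(),
#             key=lambda kv: -kv[1])
```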

Cool idea! Let us know if you try that and what you learn from it!