In this week's lectures, Andrew gave some general hints about when transfer learning makes sense, summarized as follows:
- Task A and B have the same input x.
- You have a lot more data for Task A than Task B.
- Low level features from A could be helpful for learning B.
In the example used, he mentioned an image classifier whose architecture some medical-image researchers found could serve well as the basis for a medical image classifier. So far (and theoretically), so good. However, when "Low level features from A could be helpful for learning B" is mentioned, does Andrew mean "having the same distribution"?
Say, for instance, the classifier is an "animal classifier" (i.e. a NN that, given an animal picture, tells me whether it's a cat, a dog, a horse, etc.) that was trained on 10M images; say also that I have 10K x-ray images of patients with/without osteoporosis, and I want to build a classifier that predicts whether a patient suffers from that illness or not. Provided the resolution and size of the images are the same, the first and second conditions would be met; however, the third would NOT be, since the distributions are not the same at all, and so my hypothesis would be true…
Is my assertion correct?
I don't exactly understand what you mean by "same distribution", but I don't think that is needed. For example, say someone has trained a system to classify different breeds of cats, and you want to classify dogs. The distributions are different, but the low-level features are the same, so you would probably benefit from transfer learning.
@jonaslalin has it exactly correct. But since picture == 10^3*words …
Here's an example of some low-level features extracted by a CNN. Did those come from cat images? Dogs? We don't know, and don't need to know. The caveat is that low-level features extracted from animal images might not be very helpful for training a CNN to predict crop yield from aerial photos of fields (for example).
Very cool! Sorry, but I can’t help myself: gotta pile O(10^2) more words onto your eloquent pictures.
Thanks for the concrete example. It demonstrates a point Prof Ng makes in a number of different places in the lectures: the "early" layers of a network learn very low-level features like edges and curves, and the later layers integrate those low-level features into detectors for higher-level features. When applying transfer learning, another decision you need to make is at what point in the network to do more training on your specific inputs, or whether to truncate the existing network and supply new final layers that will allow it to do well on your specific problem. Prof Ng discusses these issues in the lectures.
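To make that "truncate and add new final layers" idea concrete, here is a minimal NumPy sketch (not any real framework API; in practice you would freeze layers in Keras or PyTorch). The "pretrained" early layer is stood in for by a fixed random matrix, and the toy dataset is invented for illustration: only the new final logistic layer gets trained.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a pretrained early layer: in real transfer learning these
# weights would come from the source task (e.g. the animal classifier)
# and be kept frozen during training on the new task.
W_frozen = rng.normal(size=(2, 8))

def extract_features(X):
    """Frozen early layer: linear map + ReLU, never updated."""
    return np.maximum(X @ W_frozen, 0.0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy target-task data: two separable clusters (hypothetical, for illustration).
X = np.vstack([rng.normal(-1.0, 0.5, size=(50, 2)),
               rng.normal(+1.0, 0.5, size=(50, 2))])
y = np.array([0] * 50 + [1] * 50)

# New final layer for the target task: the ONLY trainable parameters.
w = np.zeros(8)
b = 0.0

F = extract_features(X)          # features from the frozen layers
for _ in range(500):             # plain gradient descent on the new head only
    p = sigmoid(F @ w + b)
    w -= 0.5 * (F.T @ (p - y)) / len(y)
    b -= 0.5 * np.mean(p - y)

accuracy = np.mean((sigmoid(F @ w + b) > 0.5) == y)
```

The design choice being illustrated: gradients are only ever applied to `w` and `b`, while `W_frozen` stays fixed, which is exactly what "freezing" the early layers means.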
One other caveat in the specific example of applying a pretrained animal classifier (or at least some of its early layers) to a medical image classification problem is that the types of the images may differ. The animal pictures are probably in one of the standard image representations (RGB, CMYK, …), but medical images are typically greyscale, aren't they? It's possible that difference would not end up being significant (shades of grey are colors, after all), but generally speaking you need to keep in mind that neural networks are very specific about their inputs: they all need to have the same number of pixels and be of the same type. You could convert your medical images to an RGB representation and see what happens. Image libraries typically provide all sorts of transformations that can be used as "preprocessing" to get your images into the form expected by the pretrained network in question.
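That preprocessing step can be sketched in a few lines, assuming the image is a NumPy array (a real pipeline might instead use something like Pillow's `Image.convert("RGB")`). The x-ray array below is a made-up placeholder:

```python
import numpy as np

def grayscale_to_rgb(img):
    """Replicate a single-channel (H, W) image across three channels so it
    matches the (H, W, 3) input shape an RGB-pretrained network expects."""
    if img.ndim != 2:
        raise ValueError("expected a 2-D (H, W) grayscale image")
    return np.stack([img, img, img], axis=-1)

# Hypothetical 64x64 "x-ray" with intensities in [0, 1], for illustration only.
xray = np.random.default_rng(1).random((64, 64))
rgb = grayscale_to_rgb(xray)
print(rgb.shape)  # (64, 64, 3)
```

All three channels carry the same values, so no information is added, but the tensor now has the shape the pretrained network's first layer was built for.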
Hello @jonaslalin ,
Thanks a lot! The explanation was crystal clear. As the "cherry on top of the cake", @ai_curious shared an example of low-level features (thank you too!), but the first explanation was already enough to settle everything. A bit afterwards, @paulinpaloalto added yet another scenario, where even color could make the low-level features different (which, BTW, I hadn't even considered at that point!), making the foundations even stronger!
In summary (and using my own examples): if you have a cat picture classifier, you might use it to train a new dog picture classifier (provided both have the same resolution, color, etc.), since the low-level features might end up being very similar (both animals have four legs, a head, a snout, etc.); however, the cat classifier might perform very poorly when transferred to a bone-disease classifier, since the two tasks share almost no common low-level features.
I think that pretty much solves this thread.