In the example in this video, Andrew used a network trained to predict 1000 labels, one of which was cats. Does transfer learning work when the target classes were not part of the original network's training data? For instance, could a network trained on 5 different objects then be adapted for cat classification?
If I am understanding the question correctly, it is quite a general one, and the answer may differ depending on how far you push that generality.
Note that Transfer Learning is a flexible process: one of the decisions you need to make when applying it is how deeply (how far back from the output layer) you retrain on your own data. Professor Ng does discuss this point in the lectures. You get to decide how to replace the last few layers of the network to match your required outputs, and you also get to decide how much of the existing network you retrain versus freeze because retraining it wouldn't help. That latter decision is influenced by how similar your training data is to the training data of the original base network.
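To make the freeze-versus-retrain idea concrete, here is a minimal numpy sketch (my own toy illustration, not course code): a pretend pretrained "base" layer is kept frozen while only a new binary "head" (say, cat vs. not-cat) is trained on the new task's data. All names and the synthetic dataset are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Pretend these weights come from a pretrained network; we freeze them.
W_base = rng.normal(size=(4, 3))
W_base_init = W_base.copy()  # kept only to verify the base is never updated

def features(x):
    # Frozen feature extractor: ReLU(x @ W_base).
    return np.maximum(0.0, x @ W_base)

# New binary head, trained from scratch on the new task.
W_head = np.zeros(3)

def predict(x):
    # Sigmoid over the head's logit, using the frozen features.
    return 1.0 / (1.0 + np.exp(-(features(x) @ W_head)))

# Tiny synthetic dataset standing in for the new task's training data.
X = rng.normal(size=(64, 4))
y = (X[:, 0] > 0).astype(float)

def loss(p):
    # Standard logistic (cross-entropy) loss.
    eps = 1e-9
    return -np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))

loss_init = loss(predict(X))

# Gradient descent updates touch W_head only; W_base is never modified.
for _ in range(500):
    p = predict(X)
    W_head -= 0.1 * features(X).T @ (p - y) / len(y)

loss_final = loss(predict(X))
print(f"loss: {loss_init:.3f} -> {loss_final:.3f}")
```

In a real framework you would express the same choice by marking base layers as non-trainable; here the "freeze" is simply that the update step never touches `W_base`. Retraining more layers corresponds to letting the updates reach further back from the output.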
Of course, the other thing to note is that the design of a network is constrained by the size and type of its input images: the input layer fixes the shape and number of channels the network expects. If the net was trained on grayscale images and you want to apply it to RGB images, feeding them in directly doesn't make sense.
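If you did want to reuse such a network anyway, one option is a preprocessing step that maps the new inputs to the expected shape. A small sketch (my own illustration, not from the lecture), converting RGB images to single-channel grayscale with the standard BT.601 luma weights:

```python
import numpy as np

# A fake RGB image batch element with shape (height, width, channels).
rgb = np.random.default_rng(1).random((8, 8, 3))

# Weighted sum over the channel axis; weights sum to 1, so values stay in [0, 1].
gray = rgb @ np.array([0.299, 0.587, 0.114])  # shape (8, 8)
gray = gray[..., np.newaxis]                  # restore a channel axis: (8, 8, 1)

print(gray.shape)
```

Whether this preserves enough information for the task is a separate question; the point is only that the shape mismatch must be resolved before the pretrained input layer can be reused.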
Thanks @paulinpaloalto