Hi,
Assume that a CNN model is to be developed to recognize commercial domestic planes flying in the sky. The training data should include images of flying domestic planes as positive examples. It should also cover other types of aircraft, such as private jets and helicopters. Should the training data also include images with no flying aircraft, so that the output layer has two outputs: Domestic plane and Non-domestic plane? Or would it be better to have three outputs in the output layer: Commercial domestic plane, Non-commercial domestic, Non flying aircraft?
The training dataset should resemble the actual usage of your model. Assume that you have a camera facing the sky that’ll take pictures and identify the aircraft. With this setup in place, you don’t need pictures of aircraft on the ground, since the camera will never see them once your model is deployed.
If you have very few images of aircraft in the sky and even transfer learning doesn’t help much, try training the model to classify planes on the ground in a first phase. After that, replace the head of the model and fine-tune it on your target dataset.
See this to learn about multitask learning.
As for the number of classes, that is your call: choose it based on how your model is going to be used.
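One way to keep that decision open is to annotate images with the finest labels you care about and derive the output scheme from them afterwards. The sketch below is only an illustration: the label names, the mapping, and the logits are made up for this example and are not from your dataset. It also shows that a three-class softmax output can be collapsed to the two-class view by summing the probabilities of the merged classes.

```python
import math

# Fine-grained annotation labels (hypothetical names, chosen for this sketch).
FINE_LABELS = ["commercial_domestic", "other_aircraft", "empty_sky"]

# Two-class scheme derived from the same annotations:
# everything that is not a commercial domestic plane is merged.
TWO_CLASS = {
    "commercial_domestic": "domestic",
    "other_aircraft": "non_domestic",
    "empty_sky": "non_domestic",
}


def remap(labels, scheme):
    """Translate fine-grained labels into the labels of a coarser scheme."""
    return [scheme[label] for label in labels]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def collapse_to_binary(probs3):
    """Sum the three-class probabilities that share a two-class label."""
    merged = {}
    for label, p in zip(FINE_LABELS, probs3):
        coarse = TWO_CLASS[label]
        merged[coarse] = merged.get(coarse, 0.0) + p
    return merged


# Same annotations, two-class training targets.
annotations = ["empty_sky", "commercial_domestic", "other_aircraft"]
binary_labels = remap(annotations, TWO_CLASS)

# Made-up logits from a three-output model, in FINE_LABELS order.
probs3 = softmax([2.0, 0.5, 0.1])
binary_probs = collapse_to_binary(probs3)
```

With the annotations stored this way, you can train both the two-output and the three-output variant from the same data and compare them on a held-out set that looks like what the deployed camera will see.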
Thank you for your answer. I was wondering whether grouping “aircraft-free sky” samples together with “aircraft but not domestic plane” samples under the same label would make training harder and lower performance, compared to separating them into two classes?