Pruning in output layer

I came across the following paragraph (image attached) in a paper. The author says they trained a model with 3 classes (positive, negative and uncertain). At test time, however, they restricted the softmax output to 2 of the 3 classes. How do we do that?

Screenshot 2023-02-27 212931

So they had a model with 3 outputs, but at test time they only had 2 outputs. My guess is that at test time they kept the trained model and its weights up to the pre-final layer and replaced the 3-way softmax output with a 2-way softmax output.

This is also called transfer learning with a pre-trained network.
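
To make the idea concrete, here is a minimal, framework-free sketch of swapping the output layer. Everything in it is an assumption for illustration: `frozen_features` is a hypothetical stand-in for the trained network up to the pre-final layer, and the four samples are toy data, not the paper's. Only the new 2-class head is trained; the "backbone" is never updated.

```python
import math
import random

def softmax(logits):
    # Numerically stable softmax over a list of floats.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def frozen_features(x):
    # Hypothetical stand-in for the trained model up to the pre-final layer.
    # It has no trainable state here, which is what "frozen" means.
    return [math.tanh(x[0] + 0.5 * x[1]), math.tanh(x[0] - x[1])]

def head_probs(x, W, b):
    # New output layer: one row of weights per class, followed by softmax.
    h = frozen_features(x)
    logits = [sum(w * f for w, f in zip(row, h)) + bk for row, bk in zip(W, b)]
    return softmax(logits)

def train_new_head(samples, labels, n_classes=2, lr=0.5, epochs=200, seed=0):
    # Plain SGD on cross-entropy; only W and b (the new head) are updated.
    rng = random.Random(seed)
    n_feat = len(frozen_features(samples[0]))
    W = [[rng.uniform(-0.1, 0.1) for _ in range(n_feat)] for _ in range(n_classes)]
    b = [0.0] * n_classes
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            h = frozen_features(x)
            p = head_probs(x, W, b)
            for k in range(n_classes):
                grad = p[k] - (1.0 if k == y else 0.0)  # dLoss/dlogit_k
                for j in range(n_feat):
                    W[k][j] -= lr * grad * h[j]
                b[k] -= lr * grad
    return W, b

# Toy labelled subset (illustrative only): 1 = positive, 0 = negative.
samples = [[1.0, 0.0], [0.8, 0.1], [-1.0, 0.0], [-0.9, -0.2]]
labels = [1, 1, 0, 0]
W, b = train_new_head(samples, labels)
predictions = [max(range(2), key=lambda k: head_probs(x, W, b)[k]) for x in samples]
```

In a real framework the same thing is usually done by marking the pre-trained layers as non-trainable and attaching a fresh 2-unit layer, but the exact calls depend on the library.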


Yes, that’s a possibility. However, in that case they would have to train the model again so that the weights between the pre-final layer and the new output layer could be learned. This would defeat the purpose of the prediction because of the nature of the problem. Let me briefly describe the problem here.
There are a couple thousand chest X-ray images. Each X-ray should be either positive (presence of disease) or negative (absence of disease). However, the dataset contains a 3rd class, “uncertain”, which corresponds to the case where the doctor couldn’t clearly say positive or negative. The aim is to reassign each image in this 3rd class to either positive or negative. For this, the author trains a CNN model with 3 classes (positive, negative, uncertain) with a softmax output. At test time, he says that the softmax output is restricted to only 2 classes (positive or negative). So using this 2-class softmax output, he would predict the images in the “uncertain” class as either positive or negative.
Now if we freeze the model up to the pre-final layer and add a softmax output with 2 neurons, we would have to re-train the model so that the weights between the pre-final layer and the output layer could be learned. How do we do this?

I see. With transfer learning, it would take only a small subset of images (compared to the initial training set) to guide those final activations (i.e. to train the final layer’s weights) to fit our problem’s prediction.
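
There is also a reading of "restricted" that needs no retraining at all: keep the trained 3-class model as it is, and at test time compute the softmax over only the positive and negative logits, ignoring the uncertain one. The paragraph as described in this thread does not confirm that this is what the author did, so treat the following as a sketch of one plausible interpretation. The class order and the logit values are made up for illustration.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of floats.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def restricted_softmax(logits, keep):
    # Softmax computed only over the logits whose indices are in `keep`.
    return softmax([logits[i] for i in keep])

# Assumed class order: 0 = positive, 1 = negative, 2 = uncertain.
logits = [2.0, 1.0, 3.0]  # illustrative 3-class output for one image

full = softmax(logits)                         # "uncertain" has the largest share
restricted = restricted_softmax(logits, [0, 1])  # only positive vs. negative

# Same result as renormalising the two full-softmax probabilities:
renormalised = full[0] / (full[0] + full[1])
```

Here the full softmax would pick "uncertain", while the restricted one still gives a proper 2-class distribution (roughly 0.73 positive for these logits), which is exactly the kind of output needed to push an uncertain image towards positive or negative.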