Purpose of eliminating background class

Hi all

Could someone please explain to me why we eliminate the background class from the one-hot labels?

Hello @Jairaj_Mathur ,
Thanks a lot for asking this question. In my answer I will explain why we eliminate the background class from one-hot labels, and I will provide several articles for you to explore the topic of one-hot labels/encoding on your own.

One-hot encoding converts a categorical variable into a binary vector with one entry per category, so that it can be fed to a machine learning model; this often helps models that cannot use raw category values directly. It is a preprocessing step applied to categorical features.
We eliminate the background class from one-hot labels because we only want the part of the image that belongs to a class of interest (a category). If we do not separate the background from the main subject, the model will extract features from the background and learn from them. We want the model to learn about the main subject, not the background.
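
To make this concrete, here is a minimal sketch of one-hot encoding a small label map and then dropping the background channel. The label values, the array shape, and the choice of class 0 as background are assumptions for illustration, not something taken from the course code.

```python
import numpy as np

# Hypothetical 2x3 label map: 0 = background, 1 and 2 = classes of interest.
labels = np.array([[0, 1, 1],
                   [2, 0, 1]])
num_classes = 3

# One-hot encode by indexing an identity matrix with the integer labels.
one_hot = np.eye(num_classes, dtype=np.uint8)[labels]  # shape (2, 3, 3)

# Drop channel 0 (assumed to be background); channels for classes 1 and 2 remain.
foreground = one_hot[..., 1:]                          # shape (2, 3, 2)

print(foreground[0, 0])  # background pixel: all zeros
print(foreground[0, 1])  # class 1 pixel
```

After the slice, a background pixel is simply a vector of zeros, so it is still represented implicitly even though it no longer has its own channel.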

An example of why we remove the background from one-hot labels comes from the classification of chest X-rays. If you do not remove the background, i.e. the area outside the lungs, your model is likely to become biased with respect to gender because of differences in anatomy.

Here are some additional links which may help you!

https://www.codementor.io/@abdelfettahbesbes/one-hot-encoding-in-data-science-1pe0lftu21

I hope I was able to answer your question. If you have further questions please feel free to ask. Thanks a lot for bringing this up in the Discourse Community.

Best,

Hi @Jairaj_Mathur

We eliminate the background class from the one-hot labels because the background class typically represents the voxels that do not belong to any of the target classes. The background class does not provide useful information for the task of interest, such as segmentation or classification, and can introduce noise into the model.

For example, in medical image segmentation, the task is to identify the regions of interest (e.g. tumors, lesions, organs) in the image. The background class usually represents the non-relevant regions, such as healthy tissue. Keeping the background class in the one-hot labels can make the model focus on the wrong regions and can also lengthen training and inference.

Additionally, eliminating the background class can also simplify the problem, making it easier for the model to learn the relevant features. As the model needs to consider only the remaining classes, it can improve its performance and also reduce the risk of overfitting.
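
One way to see how the background can mislead is to look at a per-class overlap score. The sketch below uses the Dice coefficient, which is my own choice of metric and is not named anywhere in this thread; the 10x10 image, the lesion position, and the prediction are all made-up values for illustration.

```python
import numpy as np

def dice(pred, target, eps=1e-7):
    """Dice coefficient between two binary masks."""
    inter = np.sum(pred * target)
    return (2.0 * inter + eps) / (np.sum(pred) + np.sum(target) + eps)

# Hypothetical ground truth: a 2x2 lesion (class 1), everything else background (class 0).
target = np.zeros((10, 10), dtype=int)
target[4:6, 4:6] = 1

# Hypothetical prediction that finds only 1 of the 4 lesion pixels.
pred = np.zeros((10, 10), dtype=int)
pred[4, 4] = 1

# One-hot encode both maps: channel 0 = background, channel 1 = lesion.
target_oh = np.eye(2)[target]
pred_oh = np.eye(2)[pred]

dice_bg = dice(pred_oh[..., 0], target_oh[..., 0])  # about 0.98
dice_fg = dice(pred_oh[..., 1], target_oh[..., 1])  # 0.4
mean_with_bg = (dice_bg + dice_fg) / 2

print(dice_bg, dice_fg, mean_with_bg)
```

In this toy case the background score is close to 1 simply because most pixels are background, so averaging it in hides how poorly the lesion was segmented. Scoring only the foreground channel keeps the number tied to the class we actually care about.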

It is important to note that in some scenarios the background class is useful and should be kept. For example, when an image contains several classes of different shapes and sizes, and the background differs from them in shape and size, keeping the background class can help the model learn the shapes and sizes of the classes.

In summary, we eliminate the background class from the one-hot labels because it does not provide any useful information for the task of interest, such as segmentation or classification, and can introduce noise to the model. It can simplify the problem, making it easier for the model to learn the relevant features, and also reduce the risk of overfitting. However, in some cases, the background class can be useful, and it should be kept.

I hope this answers your question.

Regards
Muhammad John Abbas