How to tackle class imbalance in NER

I am training an NER model with a vocab size of around 17,000. My model gives me an accuracy of 0.85, but it achieves that by classifying every word as the 'O' (other) tag, i.e. the most common tag.

I looked into the class distribution and the imbalance is huge:
{'B-art': 185,
'B-eve': 165,
'B-geo': 14699,
'B-gpe': 6644,
'B-nat': 100,
'B-org': 7787,
'B-per': 6729,
'B-tim': 7907,
'I-art': 123,
'I-eve': 139,
'I-geo': 2971,
'I-gpe': 91,
'I-nat': 44,
'I-org': 6225,
'I-per': 6894,
'I-tim': 2422,
'O': 350088}

Is there a way to tackle that? I have used 500 epochs for training. Shouldn't we use a different metric, such as IoU?

Yeah, you probably need other metrics, such as precision, recall, and F1 score, to tackle the issue of imbalance; plain accuracy is dominated by the 'O' class here.
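As a concrete way to see this, you can report entity-level precision/recall/F1 instead of token accuracy. Here is a minimal sketch using the seqeval library; the `y_true` and `y_pred` sequences below are made-up examples, standing in for your gold and predicted BIO tag sequences:

```python
# pip install seqeval
from seqeval.metrics import classification_report, f1_score

# Hypothetical gold and predicted tag sequences (lists of lists of BIO tags).
y_true = [["B-geo", "I-geo", "O", "B-per", "O"],
          ["O", "B-org", "I-org", "O", "O"]]
y_pred = [["O", "O", "O", "B-per", "O"],
          ["O", "B-org", "O", "O", "O"]]

# Entity-level scores: a model that predicts only "O" gets F1 = 0 here,
# even though its token accuracy can still look high.
print(classification_report(y_true, y_pred, digits=3))
print("micro F1:", f1_score(y_true, y_pred))
```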

Hi @Rvi

To be honest, I would suspect it is not the class imbalance problem but something else (if the model is predicting "other" all the time). It could be many things, from an incorrect model implementation to incorrect preprocessing (lowercasing, etc.) or a plain coding mistake in your data generator.

For example, the model should easily learn to predict "-geo" or "-per" tags, since those words (the actual character sequences) are rarely tagged "other" ("Vietnam" is rarely an "other" tag).
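One quick sanity check along these lines: pull a single batch out of your data generator, map the ids back to words and tags, and read it. Misaligned tokens and labels, or unexpected lowercasing/truncation, show up immediately. A rough sketch, where `train_generator`, `idx2word`, and `idx2tag` are placeholders for whatever you actually use:

```python
# Hypothetical objects: train_generator yields (word_ids, tag_ids) batches,
# idx2word / idx2tag invert your vocabulary and tag mappings.
word_ids, tag_ids = next(iter(train_generator))

sentence, labels = word_ids[0], tag_ids[0]  # first example in the batch
for w, t in zip(sentence, labels):
    # Print each token next to its label to verify the alignment by eye.
    print(f"{idx2word[int(w)]:20s} {idx2tag[int(t)]}")
```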

But if it is indeed the class imbalance problem (I would again doubt it), you could assign weights in your loss for the tags that are most important to you, as sketched below.
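A sketch of what that could look like in PyTorch, assuming a token classifier trained with cross-entropy (if you are on Keras instead, the per-token `sample_weight` route with `sample_weight_mode="temporal"` is the usual equivalent). The `tag_counts` dict is the one from the question; the tag ordering is a stand-in for whatever index mapping your model uses:

```python
import torch
import torch.nn as nn

tag_counts = {"B-art": 185, "B-eve": 165, "B-geo": 14699, "B-gpe": 6644,
              "B-nat": 100, "B-org": 7787, "B-per": 6729, "B-tim": 7907,
              "I-art": 123, "I-eve": 139, "I-geo": 2971, "I-gpe": 91,
              "I-nat": 44, "I-org": 6225, "I-per": 6894, "I-tim": 2422,
              "O": 350088}

# Hypothetical tag-to-index mapping; use the ordering your model expects.
tags = sorted(tag_counts)
tag2idx = {t: i for i, t in enumerate(tags)}

# Inverse-frequency weights, normalized so the average weight is 1:
# rare tags like "I-nat" get a large weight, "O" gets a tiny one.
counts = torch.tensor([tag_counts[t] for t in tags], dtype=torch.float)
weights = counts.sum() / (len(counts) * counts)

# ignore_index keeps padding positions out of the loss entirely.
loss_fn = nn.CrossEntropyLoss(weight=weights, ignore_index=-100)

# In the training loop, with logits of shape (batch, seq_len, num_tags)
# and targets of shape (batch, seq_len):
# loss = loss_fn(logits.view(-1, len(tags)), targets.view(-1))
```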

Also, if it's an option, you could change the dataset balance explicitly by removing "uninformative"/redundant data points (curating the dataset using expert knowledge), e.g. dropping sentences like "yeah, sure. I'm OK" that contain no entities at all.
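As a rough illustration of that kind of curation (with `sentences` and `tag_sequences` as hypothetical parallel lists of your examples and their BIO tags), you could keep only a fraction of the sentences that contain no entities:

```python
import random

random.seed(0)
KEEP_EMPTY_FRACTION = 0.2  # keep roughly 20% of all-"O" sentences

curated = [
    (sent, tags)
    for sent, tags in zip(sentences, tag_sequences)
    # keep every sentence with at least one entity, and a random
    # subsample of the entity-free ones
    if any(t != "O" for t in tags) or random.random() < KEEP_EMPTY_FRACTION
]
```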