Steps after finding that the F1 score is bad for skewed data

Thanks Andrew for talking about using precision, recall, and the F1 score to evaluate model performance on skewed data. But what is the next step? Suppose we find that accuracy on the minority label is very bad. The training process itself optimizes the loss, which to my understanding corresponds more closely to average accuracy. So can we customize the loss function so that training optimizes the metric we actually care about, say the F1 score? I know that adding more weight to the minority label is one way to do this, but in my personal experience the results are not always good.
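To make the weighting idea concrete, here is roughly what I mean (a minimal Keras sketch; the toy data, model, and weight values are made up for illustration, not taken from the course):

```python
import numpy as np
import tensorflow as tf

# Toy skewed dataset: ~5% positive (minority) labels.
X = np.random.randn(1000, 20).astype("float32")
y = (np.random.rand(1000) < 0.05).astype("float32")

model = tf.keras.Sequential([
    tf.keras.layers.Dense(16, activation="relu", input_shape=(20,)),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])

model.compile(
    optimizer="adam",
    loss="binary_crossentropy",
    # Track precision and recall during training, not just accuracy.
    metrics=[tf.keras.metrics.Precision(), tf.keras.metrics.Recall()],
)

# Up-weight the minority class (label 1) in the loss; here the weight
# is roughly the inverse of the class frequency (5% positives -> ~19x).
model.fit(X, y, epochs=5, class_weight={0: 1.0, 1: 19.0})
```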

Please jump in and share your thoughts.


Hi @jack_01234 ,
I hope I understood your question. You can certainly modify the loss function or oversample the minority class, but as you say, this can hurt performance on the other classes. A more elegant approach, which should give better results, is data augmentation on the minority class; see the sketch below. Or even better, get more real examples of the minority class if that is possible!
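As a starting point before any real augmentation, plain oversampling of the minority class can be as simple as this (a NumPy sketch; `oversample_minority` is just an illustrative name, not a library function):

```python
import numpy as np

def oversample_minority(X, y, minority_label=1, seed=0):
    """Duplicate minority-class examples until the classes are balanced."""
    rng = np.random.default_rng(seed)
    minority_idx = np.where(y == minority_label)[0]
    majority_idx = np.where(y != minority_label)[0]
    # Sample minority indices with replacement up to the majority count.
    resampled = rng.choice(minority_idx, size=len(majority_idx), replace=True)
    idx = np.concatenate([majority_idx, resampled])
    rng.shuffle(idx)
    return X[idx], y[idx]
```

For images, you would typically replace the exact duplicates with augmented copies (random flips, small rotations, crops, added noise), so the network sees varied minority examples instead of memorizing repeats.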

More info about data augmentation comes in one of the later videos.

Did that help?

But can’t we somehow embed precision / recall (the F1 score) into our cost function, to encourage the model to maximize these indicators directly? Or should we simply look at them manually and decide whether the model is good enough for our purpose?
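One trick I have seen outside the course (so take it as an assumption, not course material) is a “soft” F1 loss: the hard F1 score is not differentiable because it thresholds predictions, but if the true/false positive and negative counts are computed from the predicted probabilities themselves, the result is a differentiable surrogate that gradient descent can minimize. A minimal sketch, assuming a binary classifier with a sigmoid output:

```python
import tensorflow as tf

def soft_f1_loss(y_true, y_pred):
    """Differentiable surrogate for (1 - F1): counts are computed
    from probabilities instead of thresholded 0/1 predictions."""
    y_true = tf.cast(y_true, tf.float32)
    tp = tf.reduce_sum(y_pred * y_true)          # "soft" true positives
    fp = tf.reduce_sum(y_pred * (1.0 - y_true))  # "soft" false positives
    fn = tf.reduce_sum((1.0 - y_pred) * y_true)  # "soft" false negatives
    soft_f1 = 2.0 * tp / (2.0 * tp + fp + fn + 1e-7)
    return 1.0 - soft_f1  # minimizing this maximizes the soft F1

# Usage with a hypothetical model:
# model.compile(optimizer="adam", loss=soft_f1_loss)
```

Whether this actually trains better than weighted cross-entropy seems to be an empirical question for each dataset.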