Class imbalance problem

I am practicing ML on a dataset whose target column is class-imbalanced. I have read a lot online about different techniques, e.g. under-sampling, over-sampling, and choosing evaluation metrics other than plain accuracy, but I don't understand what steps we should follow in this case. For example, if I use over-sampling, how should I select the train/dev and test data? Should I treat the over-sampled data as my new dataset, or should I only fit my model on the over-sampled data and validate it on my original dataset?

Hi @nomi, welcome to the DLS!

Fortunately, some of those questions are introduced in Course 2: Improving Deep Neural Networks: Hyperparameter Tuning, Regularization and Optimization, and most of them are explained in detail in Course 3: Structuring Machine Learning Projects.

So you are in the right place!


Oh, that's nice. I am currently working through Course 2, and after your reply I am very excited for Course 3. I hope I will get an answer specific to my question.

Hey @nomi, in addition to Course 3, where these concepts are treated, the AI for Medical Diagnosis course also has a week of videos and exercises dedicated to solving the class imbalance problem:
class-imbalance


@javier I hope Prof. Andrew Ng has also discussed which steps (pipeline) we should follow in these problems. Thanks.
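
For reference, here is a minimal sketch of one commonly used ordering for the over-sampling case asked about above: split first, over-sample only the training split, and leave the held-out data at its original class distribution. It is not taken from the courses mentioned in this thread. The toy labels, the 5% positive rate, and the helper names `stratified_split` and `random_oversample` are all assumptions made for illustration; a dev set would be split off the same way as the test set here.

```python
import random
from collections import Counter


def stratified_split(labels, test_frac, rng):
    """Hypothetical helper: split indices so each class keeps roughly
    the same proportion in the train and test parts."""
    by_class = {}
    for i, y in enumerate(labels):
        by_class.setdefault(y, []).append(i)
    train_idx, test_idx = [], []
    for idxs in by_class.values():
        idxs = idxs[:]
        rng.shuffle(idxs)
        n_test = max(1, round(len(idxs) * test_frac))
        test_idx.extend(idxs[:n_test])
        train_idx.extend(idxs[n_test:])
    return train_idx, test_idx


def random_oversample(indices, labels, rng):
    """Hypothetical helper: duplicate minority-class indices (sampling
    with replacement) until every class matches the largest class."""
    by_class = {}
    for i in indices:
        by_class.setdefault(labels[i], []).append(i)
    target = max(len(v) for v in by_class.values())
    balanced = []
    for idxs in by_class.values():
        balanced.extend(idxs)
        balanced.extend(rng.choices(idxs, k=target - len(idxs)))
    rng.shuffle(balanced)
    return balanced


rng = random.Random(0)
labels = [1] * 50 + [0] * 950  # toy data: 5% positives (assumed)

# 1) Split BEFORE resampling, so no duplicated example leaks into test.
train_idx, test_idx = stratified_split(labels, 0.2, rng)

# 2) Over-sample the training split only.
train_bal = random_oversample(train_idx, labels, rng)

# 3) The test split keeps the original, imbalanced distribution.
train_counts = Counter(labels[i] for i in train_bal)
test_counts = Counter(labels[i] for i in test_idx)
print("train:", dict(train_counts))
print("test: ", dict(test_counts))
```

The model would then be fit on the examples indexed by `train_bal` and evaluated on those indexed by `test_idx`, ideally with metrics such as precision, recall, or F1 rather than accuracy alone, since the held-out data is still imbalanced.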