How to ensure your models are not overfitting?

How can I know that the model is improving and generalizing, and not just overfitting on the data?


Hi @guandaline! :wave:

You should test it on properly constructed validation and test sets. The validation set comes from the same distribution as the training data, and you should evaluate on it continuously while training. The moment your validation metrics start to worsen while the training metrics keep improving is the moment you start overfitting. The validation set should also be stratified and large enough; in other words, make sure it represents the classes present in the training set well.
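To make that concrete, here is a minimal sketch in Python with scikit-learn (the toy data from `make_classification` and all hyperparameters are just placeholders): a stratified validation split, plus stopping once the validation loss stops improving.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier
from sklearn.metrics import log_loss
from sklearn.model_selection import train_test_split

# Toy data standing in for your dataset.
X, y = make_classification(n_samples=5_000, n_classes=3,
                           n_informative=8, random_state=0)

# Stratified split: the validation set keeps the training class proportions.
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

model = SGDClassifier(loss="log_loss", random_state=0)
best_val, patience, bad_epochs = np.inf, 5, 0

for epoch in range(200):
    # One pass of SGD over the training data.
    model.partial_fit(X_train, y_train, classes=np.unique(y))
    val = log_loss(y_val, model.predict_proba(X_val))
    if val < best_val:
        best_val, bad_epochs = val, 0
    else:
        bad_epochs += 1          # validation metric worsening
    if bad_epochs >= patience:   # stop before the model overfits further
        print(f"Stopping at epoch {epoch}: validation loss stopped improving")
        break
```

The `patience` counter is plain early stopping: training continues only while the validation metric keeps improving, which is exactly the signal described above.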

When you’ve trained your model and are happy with your validation metrics, you should evaluate it on the test set. A properly constructed test set resembles the real-world data (including the class distribution) as closely as possible. Ideally, the data points in your test set should come from a different source than the train & val sets.
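As a final step, run a single evaluation on that held-out test set. A sketch, assuming `X_test` / `y_test` were collected from a separate source (the names are illustrative) and `model` is the trained model from above:

```python
from sklearn.metrics import classification_report

# X_test / y_test: held-out data from a separate source, never used
# during training or validation.
print(classification_report(y_test, model.predict(X_test)))
```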

There are two good Twitter threads related to your question:


Hi @guandaline !

I also recommend having a look at the free ebook (Machine Learning Yearning) on the DeepLearning.AI website.

The website link is: Resources - DeepLearning.AI

I gained a better understanding of training & tuning an ML model after reading this book.

Hope you will enjoy it too! :smiling_face_with_three_hearts:
