Add more Training Data to prevent overfitting

Yuhan_Zhang · January 12, 2023, 2:47am

If we add more training data, wouldn’t the model fit the training set better (and thus overfit and generalize less well)? So why would we add more training data to prevent overfitting? Shouldn’t we add more testing data instead?

rmwkwok · January 12, 2023, 2:56am

Hello @Yuhan_Zhang,

It should be the opposite. Adding more training data while keeping the same model architecture makes it harder for your model to fit every training data point well. It’s like catering to the dinner preferences of 10 kids: that is doable, but with 100 kids it gets much more challenging. The more data points you have, the more your model has to compromise, fitting the original points less well in order to make room for fitting the new ones.
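If it helps to see this in numbers, here is a minimal sketch. Everything in it is an assumption made up for illustration, not something from the course: the "model" is a degree-9 polynomial (so its capacity is fixed), the data is a noisy sine curve, and the names `fit_and_score` and `average_scores` are hypothetical helpers. It compares training and test error when the same model is fitted to 10 points versus 100 points.

```python
import numpy as np
from numpy.polynomial import polynomial as P

DEGREE = 9    # fixed model capacity (assumed for illustration)
NOISE = 0.3   # standard deviation of the label noise (assumed)


def true_fn(x):
    # Hypothetical underlying relationship the model tries to learn.
    return np.sin(3 * x)


def fit_and_score(n_train, seed):
    """Fit a degree-9 polynomial to n_train noisy points and
    return (training MSE, test MSE)."""
    rng = np.random.default_rng(seed)
    x_tr = rng.uniform(-1, 1, n_train)
    y_tr = true_fn(x_tr) + rng.normal(0, NOISE, n_train)
    # Fresh points from the same distribution act as the test set.
    x_te = rng.uniform(-1, 1, 1000)
    y_te = true_fn(x_te) + rng.normal(0, NOISE, 1000)

    coeffs = P.polyfit(x_tr, y_tr, DEGREE)  # least-squares fit
    train_mse = np.mean((P.polyval(x_tr, coeffs) - y_tr) ** 2)
    test_mse = np.mean((P.polyval(x_te, coeffs) - y_te) ** 2)
    return train_mse, test_mse


def average_scores(n_train, n_seeds=20):
    """Average both errors over several random datasets."""
    scores = np.array([fit_and_score(n_train, s) for s in range(n_seeds)])
    return scores.mean(axis=0)


for n in (10, 100):
    train_mse, test_mse = average_scores(n)
    print(f"n_train={n:4d}  train MSE={train_mse:.4f}  test MSE={test_mse:.4f}")
```

With 10 points, the polynomial has as many coefficients as there are data points, so it can pass through essentially all of them, noise included, and the training error is close to zero while the error on new points is large. With 100 points the same polynomial can no longer satisfy every point, so the training error goes up, but the test error comes down because the fit is forced toward the overall trend.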

Raymond

TMosh · January 12, 2023, 3:19am

Adding more training data adds more variety to the training set, which helps mitigate overfitting.
