Data Augmentation is OK when "model is large"

Andrew explained that adding synthetic data (for problems with unstructured data) would not hurt accuracy if the model is large (low bias). What does “model is large” mean?

Hi @Elaine, welcome to the forum!
I believe you are referring to this video. Can you confirm?

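“Model is large” means that the model has many layers and many nodes per layer, so it has enough capacity to fit the new examples without giving up what it already learned. A small model (e.g. few layers and few nodes per layer) has limited capacity, so fitting the added examples can come at the expense of the examples it already handled well, which could reduce overall performance.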
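To make “large” a bit more concrete, here is a minimal sketch that counts the trainable parameters of two fully connected networks. The layer sizes are made up purely for illustration (they are not from the course), and parameter count is only a rough proxy for model capacity.

```python
def count_parameters(layer_sizes):
    """Count weights and biases in a fully connected network.

    layer_sizes lists the number of units per layer, input first,
    output last. Each layer has n_in * n_out weights plus n_out biases.
    """
    return sum((n_in + 1) * n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))


# Hypothetical architectures, chosen only to show the contrast.
small_model = [64, 8, 1]                 # one narrow hidden layer
large_model = [64, 256, 256, 256, 1]     # three wide hidden layers

print("small:", count_parameters(small_model))
print("large:", count_parameters(large_model))
```

The second network has a few hundred times more parameters than the first, which is the sense in which it is “larger” and has lower bias.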

If this is still confusing, you may want to watch this on the bias/variance tradeoff:

Did that help?
