Large model vs small model when augmenting data

In the segment on data augmentation for unstructured data, it is mentioned that if a model is large (low bias), adding more data rarely hurts accuracy.

What is the definition of a large model, and how does it differ from a small model? I know Prof Ng mentioned something about a neural network being able to learn from a diverse set of data sources, but I was hoping for a more concrete example to better understand this concept of model size.

Hello,

A large model can be seen as a model with many parameters.
For instance, a linear model y(x) = ax + b has 2 parameters (a, b) and can only represent simple (linear) functions.
If you add one parameter, y(x) = ax^2 + bx + c, the model becomes larger and can represent more complicated functions: it can encode more knowledge.
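
Here is a minimal sketch of that idea (not from the course; the toy data and coefficients are made up for illustration). `np.polyfit` fits a polynomial of a given degree by least squares, so we can fit the 2-parameter and 3-parameter models to the same curved data and compare training error:

```python
import numpy as np

# Hypothetical toy data: a curved (quadratic) relationship plus noise
rng = np.random.default_rng(0)
x = np.linspace(-3, 3, 50)
y = 0.5 * x**2 - x + 1 + rng.normal(scale=0.3, size=x.shape)

# Small model: 2 parameters (a, b)  ->  y = a*x + b
small_fit = np.polyfit(x, y, deg=1)

# Larger model: 3 parameters (a, b, c)  ->  y = a*x^2 + b*x + c
large_fit = np.polyfit(x, y, deg=2)

def mse(coeffs):
    # Mean squared error of the fitted polynomial on the training data
    return np.mean((np.polyval(coeffs, x) - y) ** 2)

print(f"linear    (2 params) MSE: {mse(small_fit):.3f}")
print(f"quadratic (3 params) MSE: {mse(large_fit):.3f}")
```

The quadratic contains the linear model as the special case a = 0, which is one way to see why the extra parameter can only help it fit this data: the larger model can represent the curvature that the smaller one cannot.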

It’s the same for neural networks: if the network has many parameters (many neurons, many layers, etc.), the model is large and has the capacity to encode a lot of information, so adding more data (more information) should not hurt accuracy.
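
To make “many parameters” concrete, here is a rough sketch assuming TensorFlow/Keras (the thread doesn’t specify a framework, and the layer widths 16/512, depth, and input size 784 are just illustrative choices), comparing the parameter counts of a small and a large fully connected network:

```python
import tensorflow as tf

def build_mlp(width, depth, input_dim=784, n_classes=10):
    """Fully connected classifier; wider/deeper layers -> more parameters."""
    model = tf.keras.Sequential([tf.keras.Input(shape=(input_dim,))])
    for _ in range(depth):
        model.add(tf.keras.layers.Dense(width, activation="relu"))
    model.add(tf.keras.layers.Dense(n_classes, activation="softmax"))
    return model

small = build_mlp(width=16, depth=1)    # "small" model
large = build_mlp(width=512, depth=4)   # "large" model

print("small model parameters:", small.count_params())
print("large model parameters:", large.count_params())
```

With these illustrative sizes, the second network has roughly a hundred times more parameters than the first, and that extra capacity is what lets a large model keep absorbing more (and more diverse) data without losing accuracy.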
