Large model vs small model when augmenting data

In the segment on data augmentation for unstructured data, it is mentioned that if a model is large (low bias), adding more data rarely hurts accuracy.

What is the definition of a large model, and how does it differ from a small model? I know Prof Ng mentioned something about a neural network being able to learn from a diverse set of data sources, but I was hoping for a more concrete example to better understand this concept of model size.

Hello,

A large model can be seen as a model with many parameters.
For instance, a linear model y(x) = ax + b has 2 parameters (a, b) and can only represent simple (linear) functions.
If you add one parameter, y(x) = ax^2 + bx + c, the model becomes larger and can represent more complicated functions: it can encode more knowledge.
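
Here is a minimal sketch of that idea (not from the course; the toy data and coefficients are made up for illustration). `np.polyfit` fits a polynomial of a given degree by least squares, so we can fit the 2-parameter and 3-parameter models to the same curved data and compare training error:

```python
import numpy as np

# Hypothetical toy data: a curved (quadratic) relationship plus noise
rng = np.random.default_rng(0)
x = np.linspace(-3, 3, 50)
y = 0.5 * x**2 - x + 1 + rng.normal(scale=0.3, size=x.shape)

# Small model: 2 parameters (a, b)  ->  y = a*x + b
small_fit = np.polyfit(x, y, deg=1)

# Larger model: 3 parameters (a, b, c)  ->  y = a*x^2 + b*x + c
large_fit = np.polyfit(x, y, deg=2)

def mse(coeffs):
    # Mean squared error of the fitted polynomial on the training data
    return np.mean((np.polyval(coeffs, x) - y) ** 2)

print(f"linear    (2 params) MSE: {mse(small_fit):.3f}")
print(f"quadratic (3 params) MSE: {mse(large_fit):.3f}")
```

The quadratic contains the linear model as the special case a = 0, which is one way to see why the extra parameter can only help it fit this data: the larger model can represent the curvature that the smaller one cannot.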

It’s the same for neural networks: if the network has many parameters (many neurons, many layers, etc.), the model is large and has the capacity to encode a lot of information, so adding more data (more information) should not hurt accuracy.
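
To make “many parameters” concrete, here is a rough sketch assuming TensorFlow/Keras (the thread doesn’t specify a framework, and the layer widths 16/512, depth, and input size 784 are just illustrative choices), comparing the parameter counts of a small and a large fully connected network:

```python
import tensorflow as tf

def build_mlp(width, depth, input_dim=784, n_classes=10):
    """Fully connected classifier; wider/deeper layers -> more parameters."""
    model = tf.keras.Sequential([tf.keras.Input(shape=(input_dim,))])
    for _ in range(depth):
        model.add(tf.keras.layers.Dense(width, activation="relu"))
    model.add(tf.keras.layers.Dense(n_classes, activation="softmax"))
    return model

small = build_mlp(width=16, depth=1)    # "small" model
large = build_mlp(width=512, depth=4)   # "large" model

print("small model parameters:", small.count_params())
print("large model parameters:", large.count_params())
```

With these illustrative sizes, the second network has roughly a hundred times more parameters than the first, and that extra capacity is what lets a large model keep absorbing more (and more diverse) data without losing accuracy.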
