What I want to point out is that I see a definitional issue at the core of this thread:
- with "more data", @Juan_Olano rather highlighted more features (which also corresponds to higher model complexity, not only because of more model parameters due to the higher dimensionality, but also because the "non-linearity" is now described by a well-crafted (= modelled) feature. He is absolutely right: this will help to tackle underfitting.)
- with "more data", what is often meant is: more labels
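To illustrate the "more features" point: here is a minimal sketch with hypothetical data, where a plain linear model underfits a quadratic relationship, and adding a single hand-crafted feature (`x**2`) lets the same simple model capture the non-linearity:

```python
import numpy as np

# Hypothetical data: y depends quadratically on x (an underlying non-linearity)
rng = np.random.default_rng(0)
x = np.linspace(-3, 3, 50)
y = x**2 + rng.normal(0, 0.1, size=x.shape)

# Plain linear model (intercept + x): underfits, large residual
X_lin = np.column_stack([np.ones_like(x), x])
coef_lin, res_lin, *_ = np.linalg.lstsq(X_lin, y, rcond=None)

# Add a well-crafted feature x**2: the non-linearity is now modelled explicitly
X_feat = np.column_stack([np.ones_like(x), x, x**2])
coef_feat, res_feat, *_ = np.linalg.lstsq(X_feat, y, rcond=None)

# The engineered feature shrinks the training residual by orders of magnitude
print(res_lin[0], res_feat[0])
```

Same number of labels in both fits; only the feature representation changed, which is exactly the underfitting lever discussed above.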
True (setting aside that this is too little data [too few labels] to entertain a reasonable train/dev/test split; anyway, let's assume you use it to fit a model). However, based on the available data, I am not sure this is classic underfitting:
- the bias, or model residual, would be zero here, wouldn't it @tbhaxor?
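That point can be made concrete with a hypothetical extreme case: with only two labelled points, even the simplest linear model fits perfectly, so the measured training residual is zero, which tells us nothing about whether the model describes the real problem:

```python
import numpy as np

# Hypothetical extreme case: only two labelled points
x = np.array([1.0, 2.0])
y = np.array([3.0, 5.0])

# A simple linear model y = a*x + b passes exactly through both points
a, b = np.polyfit(x, y, deg=1)
residual = y - (a * x + b)

# Zero training residual, yet no evidence about the business problem
print(np.allclose(residual, 0.0))  # → True
```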
More abstractly: in this extreme example you have far too little data [too few labels] to describe the business problem (but not necessarily too little to fit a very simple model). In this scenario I would strongly recommend taking a look at Active Learning, which can help you find high-quality labels.
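For a flavour of how Active Learning selects which labels to acquire, here is a minimal uncertainty-sampling sketch (the pool probabilities and the `least_certain` helper are hypothetical, not from any specific library): from a large unlabelled pool, you request labels only for the points the current model is least certain about.

```python
import numpy as np

def least_certain(probs, k):
    """Return indices of the k pool points whose predicted probability is
    closest to 0.5 (maximum uncertainty for a binary classifier)."""
    uncertainty = np.abs(probs - 0.5)
    return np.argsort(uncertainty)[:k]

# Hypothetical predicted probabilities for 8 unlabelled pool points
pool_probs = np.array([0.95, 0.51, 0.10, 0.48, 0.99, 0.30, 0.55, 0.02])

# Ask the annotator to label only the 3 most ambiguous points
to_label = least_certain(pool_probs, k=3)
print(sorted(to_label.tolist()))  # → [1, 3, 6]
```

The labelling budget is spent where it is most informative, which is exactly what helps when labels, not features, are the bottleneck.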
We are all on the same page: more data never hurts!
The question is: where does it help when it comes to underfitting, as described above?