Data augmentation for devset


I have a question. It seems that the best practice is to augment data only for the training split.
The reason given for not doing so for the other splits is that augmented data might not represent reality.

If we don’t want it in the devset for that reason, why would we want to train our model on such data?


Hello Salman,

Sometimes acquiring and labelling additional observations can be an expensive and time-consuming process.
Data augmentation techniques are used to generate additional, synthetic data using the data you have.

Augmentation methods such as rescaling/cropping, flipping, adding noise, and rotation are used to get a larger training set and make the model generalise better.
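
To make these methods concrete, here is a minimal sketch using only NumPy. The function name `augment`, the noise level, and the padding size are my own illustrative choices, not something from a particular library, and it assumes square H x W x C images with float values in [0, 1].

```python
import numpy as np

def augment(image, rng, noise_std=0.05, pad=2):
    """Return a randomly augmented copy of a square H x W x C image in [0, 1].

    noise_std and pad are illustrative defaults, not recommended values.
    """
    out = image.copy()

    # Flipping: mirror the image left-to-right half of the time.
    if rng.random() < 0.5:
        out = out[:, ::-1, :]

    # Rotation: rotate by a random multiple of 90 degrees.
    k = int(rng.integers(0, 4))
    out = np.rot90(out, k=k, axes=(0, 1))

    # Cropping: reflect-pad the borders, then cut a random window
    # of the original size, which shifts the content slightly.
    h, w = out.shape[:2]
    padded = np.pad(out, ((pad, pad), (pad, pad), (0, 0)), mode="reflect")
    top = int(rng.integers(0, 2 * pad + 1))
    left = int(rng.integers(0, 2 * pad + 1))
    out = padded[top:top + h, left:left + w, :]

    # Noise: add Gaussian noise and clip back into the valid range.
    out = np.clip(out + rng.normal(0.0, noise_std, size=out.shape), 0.0, 1.0)
    return out

rng = np.random.default_rng(0)
image = rng.random((8, 8, 3))      # stand-in for a real training image
augmented = augment(image, rng)    # same shape, different pixel values
```

Each call produces a different variant of the same image, so one labelled example can yield many training samples that share its label.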

In the real world, general data preparation is usually done first and the dataset is split afterwards. Augmenting the test data, however, would only create additional synthetic samples, and you can avoid that depending on your choice of predictive analysis.
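
As a rough sketch of that ordering, assuming images stored in a NumPy array of shape N x H x W x C: split first, then augment only the training part. The function name `split_then_augment` and the `dev_frac` parameter are hypothetical, and a horizontal flip stands in for whatever augmentation you actually use.

```python
import numpy as np

def split_then_augment(images, labels, rng, dev_frac=0.2):
    """Split into train/dev, then add flipped copies to the training set only."""
    n = len(images)
    order = rng.permutation(n)          # shuffle indices before splitting
    n_dev = int(n * dev_frac)
    dev_idx, train_idx = order[:n_dev], order[n_dev:]

    x_train, y_train = images[train_idx], labels[train_idx]
    x_dev, y_dev = images[dev_idx], labels[dev_idx]

    # Augment the training split only: horizontal flip along the width axis.
    x_flipped = x_train[:, :, ::-1, :]
    x_train_aug = np.concatenate([x_train, x_flipped], axis=0)
    y_train_aug = np.concatenate([y_train, y_train], axis=0)

    # The dev split is returned untouched, so it still consists of real data.
    return (x_train_aug, y_train_aug), (x_dev, y_dev)

rng = np.random.default_rng(0)
images = rng.random((10, 4, 4, 1))   # stand-in data
labels = np.arange(10)               # unique labels, to make the split easy to check
(train_x, train_y), (dev_x, dev_y) = split_then_augment(images, labels, rng)
```

Splitting before augmenting also means that no flipped copy of a dev image can end up in the training set.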

However, one needs to understand that the main purpose of data augmentation is to generate additional data when you are unable to get more, for example because of cost or time constraints at the time of the predictive analysis.