Does it really make sense to do augmentation on the validation set?

Manu · January 19, 2022, 5:37am

Here is the way I think about this:

We use the trainSet to fit our model.
We want a wide variety of images in the trainSet, so that when the model meets real-life images it can make a correct prediction.
We can then use a validation set and a test set.
The validation set checks whether the model is performing well and guides the fine-tuning of the model.
The test set is used only once, at the end, as a final check of model performance / accuracy.
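
To make the three roles above concrete, here is a minimal sketch of such a split. The helper name `split_dataset`, the 70/15/15 proportions and the fixed seed are assumptions chosen for illustration, not something taken from the course.

```python
import random

def split_dataset(items, val_fraction=0.15, test_fraction=0.15, seed=0):
    """Shuffle once, then cut the data into train / validation / test lists.

    The fractions and the seed are illustrative assumptions.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    n_val = round(len(shuffled) * val_fraction)
    test = shuffled[:n_test]                    # used once, at the very end
    val = shuffled[n_test:n_test + n_val]       # used to assess and tune the model
    train = shuffled[n_test + n_val:]           # used to fit the model
    return train, val, test

# Stand-in for 100 image file names.
train_set, validation_set, test_set = split_dataset(range(100))
```

The three lists never share an item, so a result measured on the validation or test list is not influenced by images the model was fitted on.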

In our example we are just using a trainSet and a validationSet

The validationSet is used to assess the model and fine tune it

Conclusion:

I cannot see the point of augmenting the validation set.
It looks to me like a distortion of reality, done in order to get better results when assessing the model on the validation set.
By augmenting the validation set, we make the trainSet and validationSet more similar, and so get better-looking results when comparing train and validation performance.
But in real life the model will perform worse, because we will have fine-tuned it on images that were not real but artificially augmented.
We would simply have created a situation that lets us be over-optimistic about the model's accuracy on the validationSet.

Therefore, does it really make sense to do augmentation on the validation set? I believe not…
Am I missing something?

balaji.ambresh · January 19, 2022, 2:57pm

The only data augmentation done to the validation dataset is rescaling the pixel values to fall in the range [0, 1]. This is to match the pixel value range the network was trained on. The reason for rescaling is to drive weights to small numbers.

Other than that, you would not apply additional transformations such as rotation or brightness changes.
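
As a rough illustration of that split, here is a minimal pure-Python sketch. The function names and the tiny 2x3 "image" are made up for the example, and a horizontal flip stands in for whatever augmentations the training pipeline really uses. In Keras the same idea is usually expressed by passing only `rescale=1./255` to the validation `ImageDataGenerator`.

```python
import random

def rescale(image):
    """Map 8-bit pixel values (0..255) into the range [0, 1]."""
    return [[pixel / 255.0 for pixel in row] for row in image]

def random_horizontal_flip(image, rng, probability=0.5):
    """Augmentation: mirror the image left-to-right with the given probability."""
    if rng.random() < probability:
        return [list(reversed(row)) for row in image]
    return image

def prepare_training_image(image, rng):
    # Training images are randomly augmented, then rescaled.
    return rescale(random_horizontal_flip(image, rng))

def prepare_validation_image(image):
    # Validation images are only rescaled, never augmented.
    return rescale(image)

# Hypothetical 2x3 grayscale image, for illustration only.
image = [[0, 128, 255], [255, 128, 0]]
rng = random.Random(42)
train_view = prepare_training_image(image, rng)
val_view = prepare_validation_image(image)
```

The validation path is deterministic: the same input always gives the same output, so validation scores stay comparable from one epoch to the next.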

Manu · January 20, 2022, 6:30am

Absolutely, Balaji, we completely agree then.
