Here is the way I think about this:
- We use the trainSet to fit our model
- We want a wide variety of images within this trainSet, so that when our model meets images in real life, it is able to make a correct prediction
- We can then use a validation set and a test set
- The validation set is used to check whether the model is performing well, and it guides the fine-tuning of our model
- The test set will only be used once at the end, to make a final check of model performance / accuracy
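To make the three roles concrete, here is a minimal sketch of such a split, using only the Python standard library. The function name `split_dataset`, the 70/15/15 proportions and the fixed seed are my own assumptions for illustration, not something taken from the course or from any library.

```python
import random

def split_dataset(items, val_frac=0.15, test_frac=0.15, seed=0):
    """Shuffle once, then cut into disjoint train / validation / test lists.

    Hypothetical helper: the fractions and the seed are illustrative choices.
    """
    items = list(items)
    random.Random(seed).shuffle(items)  # seeded so the split is reproducible
    n_test = round(len(items) * test_frac)
    n_val = round(len(items) * val_frac)
    test = items[:n_test]                  # touched once, at the very end
    val = items[n_test:n_test + n_val]     # used to assess and fine-tune
    train = items[n_test + n_val:]         # used to fit the model
    return train, val, test

# Integers stand in for image file names here.
train_set, val_set, test_set = split_dataset(range(100))
print(len(train_set), len(val_set), len(test_set))
```

The important property is that the three lists never share an item, so the validation and test results are measured on images the model was not fitted on.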
In our example we are only using a trainSet and a validationSet
- The validationSet is used to assess the model and fine-tune it
Conclusion:
- I cannot see the point of augmenting the validation set.
- This looks to me like a distortion of reality in order to get better results when assessing our model on the validation set
- By augmenting the validation set, we make the trainSet and the validationSet more similar, and thus get better results when comparing training and validation performance
- But in real life our model will perform worse, because we will have fine-tuned it on images that were not real, but artificially augmented
- We have just created a situation that allows us to be over-optimistic about the model accuracy on the validationSet
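To show what I would have expected instead, here is a minimal sketch in which augmentation is applied to the training images only and the validation images are passed through untouched. Everything in it is an assumption for illustration: tiny nested lists stand in for images, a random horizontal flip stands in for whatever augmentation is really used, and `horizontal_flip` and `make_batch` are hypothetical helpers, not functions from any library.

```python
import random

def horizontal_flip(image):
    """Mirror a 2D image (a list of rows) left to right."""
    return [row[::-1] for row in image]

def make_batch(images, augment, rng):
    """Copy the images, flipping each one with probability 0.5 only if augment is True."""
    batch = []
    for img in images:
        if augment and rng.random() < 0.5:
            img = horizontal_flip(img)
        batch.append([row[:] for row in img])  # copy so the originals stay intact
    return batch

rng = random.Random(0)
train_images = [[[1, 2], [3, 4]] for _ in range(8)]  # toy 2x2 "images"
val_images = [[[5, 6], [7, 8]] for _ in range(4)]

train_batch = make_batch(train_images, augment=True, rng=rng)   # randomly flipped
val_batch = make_batch(val_images, augment=False, rng=rng)      # left as-is
```

With this setup the validation images stay exactly as they were collected, which is the behaviour I am arguing for above.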
Therefore, does it really make sense to do augmentation on the validation set? I believe not…
Am I missing something?