C2W2 Assignment - Why augment data if results degrade?

Hi,

In C2W2 we learn, among many other things, how to augment image data in order to avoid overfitting. In the assignment exercise, we use the cats-and-dogs data to test the principles learnt earlier. The accuracy I get after 15 epochs with the same model is:

  • Without augmentation: Training = 98% and Validation = 89%
  • With augmentation: Training = 76% and Validation = 82%
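To make the comparison concrete, here is a toy sketch of what "with augmentation" does (plain NumPy rather than the course's `ImageDataGenerator`; the flip/shift parameters and array sizes are illustrative): the model sees a freshly perturbed copy of the training images each epoch, while the validation images are left untouched.

```python
import numpy as np

rng = np.random.default_rng(0)

def augment(images, rng):
    """Toy stand-in for random augmentation: flip each image
    horizontally with 50% probability and shift it by up to 2 pixels."""
    out = []
    for img in images:
        if rng.random() < 0.5:
            img = img[:, ::-1]          # horizontal flip
        shift = int(rng.integers(-2, 3))
        img = np.roll(img, shift, axis=1)  # small horizontal shift
        out.append(img)
    return np.stack(out)

# Fake "cats vs dogs" batches: 4 images of 8x8 pixels each
train_batch = rng.random((4, 8, 8))
val_batch = rng.random((4, 8, 8))

# Training sees a different random variant every epoch...
epoch1 = augment(train_batch, rng)
epoch2 = augment(train_batch, rng)
# ...while validation is always evaluated on the original val_batch.
```

Since flips and rolls only rearrange pixels, every augmented epoch contains exactly the same pixel values as the originals, just in harder-to-memorize arrangements, which is why training accuracy drops.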

Given the above results, why should we augment the training set if the results degrade on the validation set? In the end, isn't the result on the validation set what counts?

Could it be that, due to the augmentation, the training set becomes more general than the validation set? The model tries to capture this broader distribution without success, resulting in degraded results on the validation set.

Could a solution be to augment the validation set in the same way as the training set? However, most information I find on the internet says that augmenting the validation set is very uncommon.

Any thoughts?

Best regards,

Wouter

Hello there,

All the sets' metrics count; it's an iterative, chained process.

This sounds about right.

The purpose of the validation set is to test and adapt by changing hyperparameters. The main goal is to fit the training dataset as well as possible, since it has the most examples. If, for example, you augment the entire dataset and then choose a part of it for validation, that might be better, I think, as long as that part is representative of the entire distribution. The danger is that if you split off a part for validation and then augment it, it might drift away from the original dataset.
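As a sketch of the conventional arrangement described above (plain NumPy with a toy flip augmentation; the array sizes and split ratio are illustrative): split first, then apply random augmentation only to the training portion, so the validation set stays close to the original distribution.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical dataset: 100 images of 8x8 pixels with binary labels
images = rng.random((100, 8, 8))
labels = rng.integers(0, 2, size=100)

# 1. Split first, shuffled so both parts are representative.
idx = rng.permutation(len(images))
train_idx, val_idx = idx[:80], idx[80:]
x_train, y_train = images[train_idx], labels[train_idx]
x_val, y_val = images[val_idx], labels[val_idx]

# 2. Augment only the training portion (toy random horizontal flips).
flips = rng.random(len(x_train)) < 0.5
x_train_aug = np.where(flips[:, None, None], x_train[:, :, ::-1], x_train)

# The validation images are left untouched, so validation accuracy is
# measured on data that still looks like the original distribution.
```

In the Keras workflow from the course this corresponds to giving the training `ImageDataGenerator` the augmentation arguments while the validation generator only rescales.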