Why don't we merge augmented data and original data?

WJC · August 25, 2021, 1:30pm

In the programming assignment, we train by replacing the training data with augmented data.

I think we need to merge the training data and the augmented data instead.

Am I misunderstanding data augmentation?

bisht · August 26, 2021, 4:09pm

Data Augmentation is a technique that can be used to artificially expand the size of a training set by creating modified data from the existing one.
For more information visit

ai_curious · August 26, 2021, 7:54pm

I thought the OP question was more or less ‘is the original training data set contained within the augmented data set?’ which I don’t see addressed by the first reply. Might be helpful to label the images, since it may not be immediately obvious what transformation has been applied and where, if at all, the original image is in that set. Just thinking out loud.

WJC · August 27, 2021, 12:11am

Thank you for replying to my question.

“However, we can improve the performance of the model by augmenting the data we already have.”
I have a question about the above part of your answer.

Did you use the word “augment” to mean “correction by supplementing the shortcomings in the training data”?

WJC · August 27, 2021, 12:12am

Thank you for replying, ai_curious.

It helps me.

ai_curious · August 27, 2021, 10:07am

This isn’t an area I have studied, but in reading the literature and TF documentation I can find easily, it seems like there isn’t a ‘rule’ about whether it’s a merge or a replacement. Many of the built-in capabilities from Keras use random transformations, especially for image manipulation, and generate them on the fly. However, my understanding is that you have the option of saving these generated images, so nothing (other than storage space) would prevent you from creating a merged data set. My intuition is that you’re doing augmentation generally when you have fewer training examples than is ideal, so not clear why you would throw out the original data…if you had more than you needed, you wouldn’t have started down the augmentation path in the first place. HTH
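To make the "merged data set" option concrete, here is a minimal sketch in plain Python (not the assignment's code, and not Keras). The names `hflip` and `merge_with_augmented` are hypothetical, and a single horizontal flip stands in for whatever transformations you would actually save to disk.

```python
def hflip(image):
    """Mirror a 2-D image (a list of rows) left to right."""
    return [row[::-1] for row in image]


def merge_with_augmented(images):
    """Offline augmentation: keep every original and append one flipped copy of each."""
    augmented = [hflip(img) for img in images]
    return images + augmented


# Two tiny 2x2 "images" used only for illustration.
originals = [
    [[1, 2], [3, 4]],
    [[5, 6], [7, 8]],
]

merged = merge_with_augmented(originals)
print(len(originals), len(merged))
```

The merged list keeps the originals and grows by one copy per image, which is the storage cost mentioned above.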

Mubsi · August 27, 2021, 12:20pm

Hi @WJC,

You did understand the concept. It is exactly as you described.

In deep learning, more data is generally better. For example, if we have a data set of 150 images, that's a fairly low number, right? So we perform data augmentation to increase the size of the data set, and yes, ideally we include the newly generated images alongside the originals.

I'm not sure why the programming assignment is only using the augmented images. I'm sure there's a reasonable explanation for it. Can you tell me which assignment it is so that I can take a look?

Thanks,
Mubsi

paulinpaloalto · August 27, 2021, 3:06pm

@Mubsi: The data augmentation case being discussed is in the Transfer Learning with MobileNet assignment. That’s C4 W2 A2.

WJC · August 27, 2021, 3:51pm

Thank you for your reply as it really seems to work for me.

Personally, I think I need to study more while looking for a thesis in this field.

WJC · August 27, 2021, 3:53pm

Thank you, Mubsi.

Any doubts I had have been resolved.

Mubsi · September 6, 2021, 2:18pm

Hi @WJC,

After looking at the assignment:

We do not replace the original data with augmented data. We pass the data through an augmentation layer that contains a random flip and a random rotation. That means that for some images you will get the original image and for others you will get a transformed one, which is like merging the original and augmented data.
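A rough way to see this, sketched in plain Python rather than with the assignment's Keras layers: apply a random flip every time an image is drawn and record which versions show up over many epochs. The helper names (`random_flip`, `variants_seen`), the flip probability of 0.5, and the fixed seed are assumptions made for illustration only, and rotation is left out to keep the sketch short.

```python
import random


def hflip(image):
    """Mirror a 2-D image (a list of rows) left to right."""
    return [row[::-1] for row in image]


def random_flip(image, rng, p=0.5):
    """Return the image mirrored with probability p, otherwise unchanged."""
    return hflip(image) if rng.random() < p else image


def variants_seen(image, epochs, seed=0):
    """Record which versions of one image the model would see across epochs."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    seen = set()
    for _ in range(epochs):
        out = random_flip(image, rng)
        seen.add("original" if out == image else "flipped")
    return seen


image = [[1, 2], [3, 4]]  # asymmetric, so a flip is distinguishable
print(variants_seen(image, epochs=20))
```

Because the transformation is re-drawn on every pass, the same training image appears unchanged in some epochs and transformed in others, without the data set ever being stored in merged form.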

Hope this helps,
Mubsi
