Course 4 Week 2: Data Augmentation (live vs increasing the dataset)

I would like to know the difference between increasing the number of images in the dataset via data augmentation and live augmentation.
Will increasing the number of images be better if we have a bigger network with more compute?
Is there any study on when to use which kind of data augmentation (increasing the dataset or live augmentation)?
Thank you.

Data augmentation is a set of techniques for enlarging your training set without acquiring new input images: you modify the images you already have to make them useful for training. Acquiring new images can also be useful, but is sometimes more expensive or difficult than data augmentation. You can then apply the augmentation techniques to your new images as well. In other words, the two ideas are essentially independent and complementary.

Thank you for your explanation. What I meant was increasing the dataset through data augmentation (not acquiring new images). Let me put it this way to be more concrete.

I have 50,000 images and three augmentation techniques.

  1. I apply each augmentation technique to produce 150,000 augmented images, and train a network on those 150,000 images, or
  2. I apply the three augmentation techniques live, training the network on the 50,000 images with those three techniques.

I would just like some insight into how best to implement data augmentation to get the best model.

Sorry, I don’t understand what you mean by your option 2). What is “live” augmentation? If you mean that you just create the augmented images “on the fly”, but don’t save them, then you are still training the network on 150,000 images, right? It’s just that you haven’t saved them statically in the training set.

Actually don’t you have 200,000 images total? The original 50k, plus the 3 variations produced by the augmentation techniques?
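To make the counting concrete, here is a minimal sketch of the static ("offline") approach. The three augmentation functions (`hflip`, `vflip`, `invert`) and the helper `expand_offline` are hypothetical stand-ins chosen for illustration, not the techniques from the question, and images are represented as plain nested lists so the example needs no libraries.

```python
def hflip(img):
    """Mirror the image left to right."""
    return [row[::-1] for row in img]

def vflip(img):
    """Mirror the image top to bottom."""
    return [row[:] for row in img[::-1]]

def invert(img, max_value=255):
    """Invert pixel intensities."""
    return [[max_value - p for p in row] for row in img]

# Three illustrative techniques (assumed; any three would do).
AUGMENTATIONS = [hflip, vflip, invert]

def expand_offline(images, augmentations=AUGMENTATIONS, keep_originals=True):
    """Build a static, enlarged training set: one variant per technique per image."""
    expanded = list(images) if keep_originals else []
    for aug in augmentations:
        expanded.extend(aug(img) for img in images)
    return expanded

# Five tiny 2x2 "images" stand in for the 50,000 real ones.
images = [[[i, i + 1], [i + 2, i + 3]] for i in range(5)]

with_originals = expand_offline(images)                        # N + 3N images
only_augmented = expand_offline(images, keep_originals=False)  # 3N images

print(len(with_originals), len(only_augmented))
```

Scaled up, the same arithmetic gives 50,000 + 3 × 50,000 = 200,000 images if the originals are kept, or 150,000 if they are not.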

But also please note that augmentation techniques do not necessarily produce one-for-one output: if your augmentation is to incrementally rotate the images by a random angle, you could elect to use 3 or 11 or 42 such random angles per input image instead of just 1. It is a choice that you make.
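A rough sketch of that choice, assuming a hypothetical `random_shift` augmentation (a circular shift of each row, used here in place of a random rotation only because it needs no image library) and a hypothetical helper `augment_many`:

```python
import random

def random_shift(img, rng):
    """Circularly shift every row by one random offset (stand-in for a random rotation)."""
    width = len(img[0])
    k = rng.randrange(width)  # k == 0 leaves the image unchanged
    return [row[-k:] + row[:-k] for row in img]

def augment_many(images, variants_per_image, seed=0):
    """Produce `variants_per_image` random variants of every input image."""
    rng = random.Random(seed)
    out = []
    for img in images:
        for _ in range(variants_per_image):
            out.append(random_shift(img, rng))
    return out

# Ten tiny 2x4 "images" as nested lists.
images = [[[1, 2, 3, 4], [5, 6, 7, 8]] for _ in range(10)]

# The number of variants per image is a free choice, not fixed by the technique.
for n in (1, 3, 11, 42):
    print(n, "variants per image ->", len(augment_many(images, n)), "augmented images")
```

The size of the augmented set is simply the number of input images times the number of variants you decide to draw per image.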

Thank you. Yes, you are right: for option 1, when we include the original images, we train the network on 200,000 images.

For option 2, what I have seen in current implementations (e.g., for CIFAR-10) is that they train the network only on augmented images without increasing the number of images, so the count stays at 50,000. The images are augmented on the fly while the batches are loaded.
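Here is a minimal sketch of what I mean by augmenting while loading the batches. It is not taken from any particular CIFAR-10 implementation; `random_hflip` and `batches_on_the_fly` are hypothetical names, and a random horizontal flip is assumed as the only augmentation to keep it short.

```python
import random

def random_hflip(img, rng, p=0.5):
    """Flip the image left to right with probability p, otherwise return a copy."""
    if rng.random() < p:
        return [row[::-1] for row in img]
    return [row[:] for row in img]

def batches_on_the_fly(images, batch_size, rng):
    """Yield shuffled batches, augmenting each image at load time without storing it."""
    order = list(range(len(images)))
    rng.shuffle(order)
    for start in range(0, len(order), batch_size):
        idx = order[start:start + batch_size]
        yield [random_hflip(images[i], rng) for i in idx]

rng = random.Random(0)
# Eight tiny 1x3 "images" stand in for the 50,000 training images.
images = [[[i, i + 1, i + 2]] for i in range(8)]

samples_per_epoch = []
distinct_seen = set()
for epoch in range(5):
    count = 0
    for batch in batches_on_the_fly(images, batch_size=4, rng=rng):
        count += len(batch)
        for img in batch:
            distinct_seen.add(tuple(tuple(row) for row in img))
    samples_per_epoch.append(count)

print("samples per epoch:", samples_per_epoch)
print("distinct variants seen over all epochs:", len(distinct_seen))
```

In this sketch the number of samples per epoch always equals the size of the stored dataset, while the set of distinct variants seen can grow from one epoch to the next because each image is re-augmented randomly every time it is loaded.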

All the classification implementations I have seen so far use augmentation as pre-processing (without increasing the dataset). I wonder whether increasing the dataset would work better for a bigger model than pre-processing (live augmentation).