C4_W4_Lab_1_First_GAN discriminator training data question

Hello community.

I have a question about the first ungraded lab: C4_W4_Lab_1_First_GAN
My question is about the discriminator training data:

# ... training loop declaration
            # infer batch size from the training batch
            batch_size = real_images.shape[0]

            # Train the discriminator - PHASE 1
            # Create the noise
            noise = tf.random.normal(shape=[batch_size, random_normal_dimensions])
            
            # Use the noise to generate fake images
            fake_images = generator(noise)
            
            # Create a list by concatenating the fake images with the real ones
            mixed_images = tf.concat([fake_images, real_images], axis=0) # <---------- HERE
            
            # Create the labels for the discriminator
            # 0 for the fake images, 1 for the real images
            discriminator_labels = tf.constant([[0.]] * batch_size + [[1.]] * batch_size)
            
            # Ensure that the discriminator is trainable
            discriminator.trainable = True
            
            # Use train_on_batch to train the discriminator with the mixed images and the discriminator labels
            discriminator.train_on_batch(mixed_images, discriminator_labels)

It bothers me that the training batch will have all the fake images together and then all the real images. Won’t that produce overfitting?

Or is it the case that when train_on_batch is called on the discriminator there is a built-in shuffling mechanism?

If the discriminator consistently receives batches where the first half contains only fake images and the second half only real images, it might learn to distinguish based on the position in the batch rather than learning the actual features that distinguish real from fake images. This could lead to poor generalization.
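For reference, here is a rough sketch of what I had in mind: shuffling the concatenated images and their labels together (with the same permutation) before calling train_on_batch. This is my own illustration, not code from the lab; shuffle_mixed_batch, tf.random.shuffle and tf.gather are just one way it could be done.

    import tensorflow as tf

    def shuffle_mixed_batch(mixed_images, discriminator_labels):
        # One random permutation applied to both tensors, so every image
        # keeps its correct real/fake label after shuffling
        permutation = tf.random.shuffle(tf.range(tf.shape(mixed_images)[0]))
        return tf.gather(mixed_images, permutation), tf.gather(discriminator_labels, permutation)

    # Tiny dummy example: 2 "fake" and 2 "real" 28x28 images
    mixed_images = tf.concat([tf.zeros([2, 28, 28]), tf.ones([2, 28, 28])], axis=0)
    discriminator_labels = tf.constant([[0.]] * 2 + [[1.]] * 2)

    shuffled_images, shuffled_labels = shuffle_mixed_batch(mixed_images, discriminator_labels)
    print(shuffled_labels.numpy().ravel())  # e.g. [1. 0. 0. 1.]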

Am I overthinking this?

The GAN's purpose is to produce images that look real, not to fit any particular data. I think the focus here is on whether the data looks real or fake; the position where an image is fed during training does not introduce any bias, because the GAN learns its parameters (weights) independently of the image's position in the batch!
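As a quick sanity check (just a sketch with a made-up tiny dense discriminator, not the one from the lab), a model without batch-dependent layers such as BatchNormalization in training mode gives the same prediction for an image whether it sits first or last in the batch, because each sample in the batch is processed independently:

    import tensorflow as tf

    # Hypothetical tiny discriminator, only for illustration
    discriminator = tf.keras.Sequential([
        tf.keras.Input(shape=(28, 28)),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ])

    images = tf.random.normal([8, 28, 28])

    # Prediction for image 3 inside the full batch...
    batch_preds = discriminator(images, training=False)
    # ...is the same as predicting it on its own
    single_pred = discriminator(images[3:4], training=False)

    tf.debugging.assert_near(batch_preds[3], single_pred[0])
    print("same prediction regardless of batch position")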

Also to add, fabrico, it is nowhere mentioned that the fake images are added in the same amount as the real images. Here the fake images are only used to introduce noise, or say randomness, to check how well the GAN can produce a realistic-looking image.

Hey @gent.spah, thanks a lot for your answer.

I understand that the goal of the discriminator is to tell if an image is real or fake. But if the real and fake images in the training batch are not shuffled together but instead arranged in halves, couldn't it be that the discriminator neural net learns the positioning of the image rather than its contents?

Hey @Deepti_Prasad , thanks for your answer.

I think they're added in the same amount as the real images, because the shape of the normal (noise) distribution includes the batch size, which is the first component of the real images' shape.

            batch_size = real_images.shape[0] # <-- HERE 

            # Train the discriminator - PHASE 1
            # Create the noise
            noise = tf.random.normal(shape=[batch_size, random_normal_dimensions]) 
            
            # Use the noise to generate fake images
            fake_images = generator(noise)
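So, concretely (with a tiny made-up batch size just for illustration), the batch the discriminator is trained on ends up with exactly as many fake labels (0) as real labels (1):

    import tensorflow as tf

    batch_size = 4  # made-up value, just to show the shapes

    # Same label construction as in the lab:
    # first half fake (0), second half real (1)
    discriminator_labels = tf.constant([[0.]] * batch_size + [[1.]] * batch_size)

    print(discriminator_labels.shape)            # (8, 1) -> 2 * batch_size rows
    print(discriminator_labels.numpy().ravel())  # [0. 0. 0. 0. 1. 1. 1. 1.]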

By equality per se I meant similarity between the two; of course, batch_size would be the same as for the real images, as the discriminator sees an equal number of fake and real images based on the noise fed to the generator.

I think the positioning of the images just creates more noise. This part is explained by Lawrence.

Thanks for your reply @Deepti_Prasad
I will watch the videos again. I think I missed that part explained by Lawrence.

No, I don't think so!