Hi!
In the lesson and in the original training code, we first train the Discriminator and then the Generator.
I went ahead and ran an experiment that swaps this: in the loop, I trained the Generator first and then the Discriminator (see the sketch below the numbers). Empirically, the generator loss looks a bit better with this new order.
Original:
Epoch 3, step 1500: Generator loss: 3.9927062740325936, discriminator loss: 0.024210385922342533
Experiment 1:
Epoch 3, step 1500: Generator loss: 2.067182400345801, discriminator loss: 0.30279242809861917
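For reference, here is a minimal, self-contained sketch of what I mean by the swapped order. The toy MLP models and random data are just stand-ins for the course's actual architecture and dataloader, and names like gen, disc, gen_opt are mine, not necessarily the assignment's:

```python
# Experiment 1 sketch: Generator step first, then Discriminator step.
# Toy setup; the real experiment uses the course's models and data.
import torch
from torch import nn

z_dim, img_dim, batch_size = 64, 784, 128
gen = nn.Sequential(nn.Linear(z_dim, 256), nn.ReLU(), nn.Linear(256, img_dim))
disc = nn.Sequential(nn.Linear(img_dim, 256), nn.ReLU(), nn.Linear(256, 1))
gen_opt = torch.optim.Adam(gen.parameters(), lr=2e-4)
disc_opt = torch.optim.Adam(disc.parameters(), lr=2e-4)
criterion = nn.BCEWithLogitsLoss()

for step in range(100):
    real = torch.randn(batch_size, img_dim)  # stand-in for a real batch

    # --- Generator first: fool the current Disc ---
    gen_opt.zero_grad()
    fake_2 = gen(torch.randn(batch_size, z_dim))
    gen_pred = disc(fake_2)
    gen_loss = criterion(gen_pred, torch.ones_like(gen_pred))
    gen_loss.backward()
    gen_opt.step()

    # --- Discriminator second, on a fresh detached fake (still two
    # fakes per cycle, as in the original; only the order is swapped) ---
    disc_opt.zero_grad()
    fake_1 = gen(torch.randn(batch_size, z_dim)).detach()
    fake_pred, real_pred = disc(fake_1), disc(real)
    disc_loss = (criterion(fake_pred, torch.zeros_like(fake_pred))
                 + criterion(real_pred, torch.ones_like(real_pred))) / 2
    disc_loss.backward()
    disc_opt.step()
```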
Then I did Experiment 2:
In the lecture and in the original training algorithm, the Disc is trained on a FAKE_1 image from the Gen, and the Gen is trained on a new FAKE_2 image. In short, in each cycle the Gen generates two images: one to train the Disc and one to train itself. In my Experiment 2 I used the same image for both updates (sketch after the results).
The empirical result: the Gen losses are even better with this configuration:
Original:
Epoch 3, step 1500: Generator loss: 3.9927062740325936, discriminator loss: 0.024210385922342533
Experiment 2:
Epoch 3, step 1500: Generator loss: 1.6950061668455585, discriminator loss: 0.3041314163953065
Epoch 21, step 10000: Generator loss: 0.7340306047201158, discriminator loss: 0.6788859988451004
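And a sketch of Experiment 2, reusing the toy setup from the block above and assuming the original Disc-then-Gen order: a single fake per cycle, detached for the Disc step so that its graph stays intact for the Gen step:

```python
# Experiment 2 sketch: one fake serves both updates.
# Reuses gen, disc, optimizers, and criterion from the previous sketch.
for step in range(100):
    real = torch.randn(batch_size, img_dim)  # stand-in for a real batch

    # Single fake for this cycle
    fake = gen(torch.randn(batch_size, z_dim))

    # Disc step on fake.detach(): no gradients flow into the Gen here,
    # and the Gen's computation graph survives for the Gen step below
    disc_opt.zero_grad()
    fake_pred, real_pred = disc(fake.detach()), disc(real)
    disc_loss = (criterion(fake_pred, torch.zeros_like(fake_pred))
                 + criterion(real_pred, torch.ones_like(real_pred))) / 2
    disc_loss.backward()
    disc_opt.step()

    # Gen step reuses the very same fake, now judged by the
    # just-updated Discriminator
    gen_opt.zero_grad()
    gen_pred = disc(fake)
    gen_loss = criterion(gen_pred, torch.ones_like(gen_pred))
    gen_loss.backward()
    gen_opt.step()
```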
I wonder if this is just idle tinkering, or if there is something to be said about it:
- Is it important to train the Disc first and then the Gen, or is it irrelevant?
- Is it important to use two different fake images, or is it irrelevant?
Thanks!
Juan