Optimization Order

saileshbaidya · July 8, 2024, 5:41pm

In the 1st assignment “Your First GAN”, we have chosen to run forward and backward propagation on the discriminator first and then the generator. Is there any specific reason to choose that order? Can we reverse the order?

Thanks,
Sailesh

paulinpaloalto · July 8, 2024, 6:48pm

Interesting question. It’s been quite a while since I watched the lectures for this course, so I forget whether Prof Zhou discusses this point. Just on general principles, some of the discriminator’s inputs are real images, so it has a better chance of learning something meaningful before the generator has had any training. Training the generator, on the other hand, depends on feedback from the discriminator. So starting in the given order may make it more likely that you make meaningful progress even on the first iteration, because the discriminator’s feedback to the generator is then not purely random.

But that’s just my intuition. We do all this in a loop that gets repeated many times, so maybe it doesn’t matter which you do first. Try it the other way and see what happens. Science!
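For anyone who wants to try the swap without touching the assignment notebook, here is a minimal sketch of the experiment. It is not the assignment's PyTorch code: it is a toy 1-D GAN in plain Python with hand-derived gradients. The function name `train_toy_gan`, the `disc_first` flag, the linear generator, the logistic discriminator, and the N(4, 1) "real" data are all assumptions made up for illustration.

```python
import math
import random


def sigmoid(s):
    # Numerically stable logistic function.
    if s >= 0:
        return 1.0 / (1.0 + math.exp(-s))
    e = math.exp(s)
    return e / (1.0 + e)


def train_toy_gan(disc_first=True, steps=2000, lr=0.05, seed=0):
    """Toy 1-D GAN (illustrative only).

    Generator:      G(z) = a * z + b
    Discriminator:  D(x) = sigmoid(w * x + c)
    Real data:      samples from N(4, 1), an arbitrary choice
    """
    rng = random.Random(seed)
    a, b = 1.0, 0.0   # generator parameters
    w, c = 0.1, 0.0   # discriminator parameters

    def disc_step():
        nonlocal w, c
        x_real = rng.gauss(4.0, 1.0)
        x_fake = a * rng.gauss(0.0, 1.0) + b
        d_real = sigmoid(w * x_real + c)
        d_fake = sigmoid(w * x_fake + c)
        # Gradient of -log D(x_real) - log(1 - D(x_fake)) w.r.t. w and c.
        grad_w = (d_real - 1.0) * x_real + d_fake * x_fake
        grad_c = (d_real - 1.0) + d_fake
        w -= lr * grad_w
        c -= lr * grad_c

    def gen_step():
        nonlocal a, b
        z = rng.gauss(0.0, 1.0)
        d_fake = sigmoid(w * (a * z + b) + c)
        # Gradient of the non-saturating loss -log D(G(z)) w.r.t. G(z).
        grad_x = (d_fake - 1.0) * w
        a -= lr * grad_x * z
        b -= lr * grad_x

    # The only thing the flag changes is which update runs first in each iteration.
    if disc_first:
        first, second = disc_step, gen_step
    else:
        first, second = gen_step, disc_step
    for _ in range(steps):
        first()
        second()
    return {"a": a, "b": b, "w": w, "c": c}
```

Calling `train_toy_gan(disc_first=True)` and `train_toy_gan(disc_first=False)` with the same seed and comparing the returned parameters is one way to run the comparison. A model this small says nothing definitive about the assignment's networks, so treat it as a starting point for the experiment rather than an answer.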

saileshbaidya · July 8, 2024, 10:17pm

Thanks @paulinpaloalto, your response aligns with my research. Here is what I found: updating the discriminator first means it has already seen real and fake images and started learning to tell them apart before it scores the generator. That makes the generator’s task more challenging, which is essential for its learning process. If we update the generator first, its loss is computed with an untrained discriminator, so the feedback to the generator might not be reliable.
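To make the "feedback" concrete, here is the chain rule written out. This assumes the common non-saturating generator loss with generator parameters \theta; the notation is mine, not taken from the assignment.

```latex
L_G(\theta) = -\,\mathbb{E}_{z}\!\left[\log D\big(G_\theta(z)\big)\right]

\nabla_\theta L_G
  = -\,\mathbb{E}_{z}\!\left[
      \frac{1}{D\big(G_\theta(z)\big)}
      \left(\frac{\partial G_\theta(z)}{\partial \theta}\right)^{\!\top}
      \nabla_x D(x)\Big|_{x = G_\theta(z)}
    \right]
```

Every generator update is steered by \nabla_x D, the discriminator's gradient with respect to its input. If the discriminator is still at its random initialization, that direction reflects random weights rather than anything learned from real images.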

paulinpaloalto · July 8, 2024, 10:22pm

Yes, that makes sense, but it’s only a question of what happens on the first iteration, right? Once we’re past that then we’re off to the races. Typically we’re doing at a minimum hundreds and a lot of times more like thousands or tens of thousands of training iterations, so maybe this is all in the noise in reality.

But still, might as well start off on the “right foot” if we can. Or to put it another way, why waste the first iteration if you don’t have to?

saileshbaidya · July 8, 2024, 10:36pm

I believe it should only matter in the first iteration, but I’m not 100% sure. It might have a chain effect on subsequent iterations and delay convergence.
