Detach in the Loss class of pix2pixHD

I don’t know the details of this optional assignment, but the general point is that computing gradients is expensive, so you want to avoid building and backpropagating through portions of the compute graph that aren’t needed for the update you are doing at that point. The classic and simplest example is training the discriminator: you don’t need gradients with respect to the generator’s parameters, so you detach the generator’s outputs before feeding them to the discriminator. When you train the generator, however, you cannot detach the discriminator, because the generator’s gradients are computed by backpropagating through the discriminator. In that case you just have to be careful not to *apply* the resulting discriminator gradients, i.e. you only step the generator’s optimizer. Note that detaching during the discriminator step is not a “correctness” issue, only a performance one. Here’s a thread which discusses this point in the context of the simple case I just described.
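Here is a minimal PyTorch sketch of the pattern described above. The tiny `Linear` networks standing in for the generator and discriminator are hypothetical placeholders, not pix2pixHD’s actual models, but the `detach()` logic is the same:

```python
import torch
import torch.nn as nn

# Toy stand-ins for the generator and discriminator (hypothetical;
# pix2pixHD's real networks are much larger, but the detach logic is identical).
G = nn.Linear(4, 4)
D = nn.Linear(4, 1)
opt_G = torch.optim.SGD(G.parameters(), lr=0.01)
opt_D = torch.optim.SGD(D.parameters(), lr=0.01)

z = torch.randn(8, 4)

# --- Discriminator step: detach the generator's output ---
fake = G(z)
loss_D = D(fake.detach()).mean()  # detach: no graph is built back into G
opt_D.zero_grad()
loss_D.backward()
opt_D.step()
# The detach means G received no gradients at all on this step.
assert all(p.grad is None for p in G.parameters())

# --- Generator step: do NOT detach the discriminator ---
loss_G = -D(G(z)).mean()  # gradients must flow *through* D to reach G
opt_G.zero_grad()
opt_D.zero_grad()
loss_G.backward()
opt_G.step()  # only G's optimizer steps; D gets gradients here but is not updated
```

Note that in the generator step, `D`’s parameters do accumulate gradients (they can’t be avoided, since the chain rule passes through them), which is why discriminator gradients are typically zeroed before the next discriminator update.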
