Why does epsilon have requires_grad=True for gp?

for _ in range(crit_repeats):
    ### Update critic ###
    crit_opt.zero_grad()
    fake_noise = get_noise(cur_batch_size, z_dim, device=device)
    fake = gen(fake_noise)
    crit_fake_pred = crit(fake.detach())
    crit_real_pred = crit(real)

    epsilon = torch.rand(len(real), 1, 1, 1, device=device, requires_grad=True)
    gradient = get_gradient(crit, real, fake.detach(), epsilon)
    gp = gradient_penalty(gradient)
    crit_loss = get_crit_loss(crit_fake_pred, crit_real_pred, gp, c_lambda)

In the above code, we can see that epsilon is re-initialized on every iteration. But why does it have requires_grad=True?
Because we are trying to calculate the gradients of crit(mixed_images) with respect to mixed_images, where:

mixed_images = real * epsilon + fake * (1 - epsilon)
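
For context, the get_gradient helper called in the loop above might look roughly like this. This is only a sketch based on the standard WGAN-GP setup (the assignment's version may differ in details); the point it illustrates is that torch.autograd.grad can only differentiate with respect to mixed_images because mixed_images is part of the autograd graph, which is exactly what epsilon's requires_grad=True guarantees:

import torch

def get_gradient(crit, real, fake, epsilon):
    # Interpolate between real and fake images; because epsilon has
    # requires_grad=True, mixed_images ends up in the autograd graph.
    mixed_images = real * epsilon + fake * (1 - epsilon)
    mixed_scores = crit(mixed_images)
    # Gradient of the critic's scores with respect to the mixed images
    gradient = torch.autograd.grad(
        outputs=mixed_scores,
        inputs=mixed_images,
        grad_outputs=torch.ones_like(mixed_scores),
        create_graph=True,
        retain_graph=True,
    )[0]
    return gradient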

@Raviteja, you are right about why this is - it's because epsilon is used in calculating mixed_images, and mixed_images is passed to the critic and used in the critic's calculation of mixed_scores. torch.autograd.grad() needs the tensors used in calculating the output value to have requires_grad=True.

As a test, you can try commenting out epsilon's .requires_grad_() in test_get_gradient() and see the error torch.autograd.grad() raises.
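
The exact error text may vary by PyTorch version, but you can reproduce the behaviour with a tiny standalone example (purely illustrative, not part of the assignment):

import torch

x = torch.rand(3)                       # requires_grad defaults to False
y = (x * 2).sum()
# torch.autograd.grad(outputs=y, inputs=x)  # raises a RuntimeError: y is not connected to the graph

x = torch.rand(3, requires_grad=True)   # now x is tracked by autograd
y = (x * 2).sum()
print(torch.autograd.grad(outputs=y, inputs=x))  # (tensor([2., 2., 2.]),)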

Setting requires_grad on epsilon is just one way of making sure requires_grad is True for the parameter that’s passed to the critic. For example, another approach would be to set it directly on the parameter passed to the critic like this:

mixed_scores = crit(mixed_images.requires_grad_())

The main thing is to make sure requires_grad is set for every tensor autograd needs to differentiate through.
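
To tie it back to the loop at the top: once get_gradient has returned that gradient, the penalty itself no longer involves epsilon directly. A sketch of the standard WGAN-GP penalty that gradient_penalty presumably computes from it (again, the assignment's version may differ slightly):

import torch

def gradient_penalty(gradient):
    # Flatten each example's gradient so the L2 norm is taken per image
    gradient = gradient.view(len(gradient), -1)
    gradient_norm = gradient.norm(2, dim=1)
    # Penalize the squared distance of each per-image norm from 1
    return torch.mean((gradient_norm - 1) ** 2)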
