Why does epsilon have requires_grad=True for gp?

for _ in range(crit_repeats):
    ### Update critic ###
    crit_opt.zero_grad()
    fake_noise = get_noise(cur_batch_size, z_dim, device=device)
    fake = gen(fake_noise)
    crit_fake_pred = crit(fake.detach())
    crit_real_pred = crit(real)

    epsilon = torch.rand(len(real), 1, 1, 1, device=device, requires_grad=True)
    gradient = get_gradient(crit, real, fake.detach(), epsilon)
    gp = gradient_penalty(gradient)
    crit_loss = get_crit_loss(crit_fake_pred, crit_real_pred, gp, c_lambda)

In the above code, we can see that epsilon is re-initialized on every iteration. But why does it have requires_grad=True?
Because we are trying to calculate the gradients of crit(mixed_images) with respect to mixed_images, where:

mixed_images = real * epsilon + fake * (1 - epsilon)
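
For context, the get_gradient helper called in the loop above might look roughly like this. This is only a sketch based on the standard WGAN-GP setup (the assignment's version may differ in details); the point it illustrates is that torch.autograd.grad can only differentiate with respect to mixed_images because mixed_images is part of the autograd graph, which is exactly what epsilon's requires_grad=True guarantees:

import torch

def get_gradient(crit, real, fake, epsilon):
    # Interpolate between real and fake images; because epsilon has
    # requires_grad=True, mixed_images ends up in the autograd graph.
    mixed_images = real * epsilon + fake * (1 - epsilon)
    mixed_scores = crit(mixed_images)
    # Gradient of the critic's scores with respect to the mixed images
    gradient = torch.autograd.grad(
        outputs=mixed_scores,
        inputs=mixed_images,
        grad_outputs=torch.ones_like(mixed_scores),
        create_graph=True,
        retain_graph=True,
    )[0]
    return gradient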

@Raviteja, you are right about why this is - it's because epsilon is used in calculating mixed_images, and mixed_images is passed to the critic and used in the critic's calculation of mixed_scores. torch.autograd.grad() needs the tensors used in calculating the output value to have requires_grad=True.

As a test, you can try commenting out epsilon's .requires_grad_() in test_get_gradient() and see the error torch.autograd.grad() raises.
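
The exact error text may vary by PyTorch version, but you can reproduce the behaviour with a tiny standalone example (purely illustrative, not part of the assignment):

import torch

x = torch.rand(3)                       # requires_grad defaults to False
y = (x * 2).sum()
# torch.autograd.grad(outputs=y, inputs=x)  # raises a RuntimeError: y is not connected to the graph

x = torch.rand(3, requires_grad=True)   # now x is tracked by autograd
y = (x * 2).sum()
print(torch.autograd.grad(outputs=y, inputs=x))  # (tensor([2., 2., 2.]),)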

Setting requires_grad on epsilon is just one way of making sure requires_grad is True for the parameter that’s passed to the critic. For example, another approach would be to set it directly on the parameter passed to the critic like this:

mixed_scores = crit(mixed_images.requires_grad_())

The main thing is to make sure requires_grad is set for every tensor autograd needs to differentiate through.
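
To tie it back to the loop at the top: once get_gradient has returned that gradient, the penalty itself no longer involves epsilon directly. A sketch of the standard WGAN-GP penalty that gradient_penalty presumably computes from it (again, the assignment's version may differ slightly):

import torch

def gradient_penalty(gradient):
    # Flatten each example's gradient so the L2 norm is taken per image
    gradient = gradient.view(len(gradient), -1)
    gradient_norm = gradient.norm(2, dim=1)
    # Penalize the squared distance of each per-image norm from 1
    return torch.mean((gradient_norm - 1) ** 2)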
