RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn

Hi, I get a RuntimeError when the test_disc_reasonable() function calls disc_loss.backward(). Here is the error trace:


RuntimeError                              Traceback (most recent call last)
<ipython-input> in <module>
     71             break
     72 
---> 73 test_disc_reasonable()
     74 test_disc_loss()
     75 print("Success!")

<ipython-input> in test_disc_reasonable(num_images)
     32     criterion = lambda x, y: torch.sum(x) + torch.sum(y)
     33     disc_loss = get_disc_loss(gen, disc, criterion, real, num_images, z_dim, 'cpu').mean()
---> 34     disc_loss.backward()
     35     assert torch.isclose(torch.abs(disc.weight.grad.mean() - 11.25), torch.tensor(3.75))
     36 

/usr/local/lib/python3.6/dist-packages/torch/tensor.py in backward(self, gradient, retain_graph, create_graph)
    193                 products. Defaults to ``False``.
    194         """
--> 195         torch.autograd.backward(self, gradient, retain_graph, create_graph)
    196 
    197     def register_hook(self, hook):

/usr/local/lib/python3.6/dist-packages/torch/autograd/__init__.py in backward(tensors, grad_tensors, retain_graph, create_graph, grad_variables)
     97     Variable._execution_engine.run_backward(
     98         tensors, grad_tensors, retain_graph, create_graph,
---> 99         allow_unreachable=True)  # allow_unreachable flag
    100 
    101 

RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn

Interesting! My guess is that it has something to do with how you are using detach() to avoid computing gradients on the generator. Did you perhaps apply detach() to the real images being fed to the discriminator as well? I tried a couple of experiments along those lines but could not reproduce the exact trace you show. What the error message is telling you is that the discriminator loss has no grad_fn, so autograd has no graph to walk when it tries to back-propagate.
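As a tiny, self-contained illustration of what that message means (this is a toy repro, not the assignment code): calling backward() on a tensor that detach() has severed from the graph raises exactly this error.

```python
import torch

x = torch.randn(3, requires_grad=True)

loss = (x * 2).sum()
loss.backward()  # fine: loss has a grad_fn, so autograd can walk the graph

# detach() returns a tensor cut out of the graph: no grad_fn at all.
cut = (x * 2).sum().detach()
try:
    cut.backward()
except RuntimeError as e:
    print(e)  # element 0 of tensors does not require grad and does not have a grad_fn
```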

Start by checking the uses of detach() in your code: it should be applied only to the fake images, before they are fed to the discriminator.

Had you used PyTorch before you started the GANs specialization? Maybe you know some other PyTorch mechanisms for avoiding gradient propagation.
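One such mechanism is torch.no_grad(): anything computed inside the block is left out of the autograd graph, so the fake images need no detach() afterwards. A sketch with hypothetical toy stand-ins (plain nn.Linear layers, not the course models):

```python
import torch
from torch import nn

# Hypothetical toy models standing in for the course's gen/disc.
gen = nn.Linear(4, 8)
disc = nn.Linear(8, 1)
noise = torch.randn(16, 4)

# Everything computed inside no_grad() is excluded from the graph.
with torch.no_grad():
    fake = gen(noise)

preds = disc(fake)  # the graph starts here, at the discriminator
preds.sum().backward()

print(fake.requires_grad)            # False: generator is out of the graph
print(disc.weight.grad is not None)  # True: discriminator still gets gradients
print(gen.weight.grad is None)       # True: generator got none
```

One trade-off to note: unlike detach(), no_grad() skips building the graph during the forward pass itself (saving memory), but that fake batch then cannot be reused for the generator's own update, which needs gradients through gen.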


Hi, I have the same problem. I have no idea why it happens.

noise = get_noise(n_samples=num_images, z_dim=z_dim, device=device)
fake = gen(noise)
fake_preds = disc(fake)
fake_loss = criterion(fake_preds.detach(), torch.zeros_like(fake_preds))
real_preds = disc(real)
real_loss = criterion(real_preds.detach(), torch.ones_like(real_preds))

One thing to try is not detaching the real predictions. The point is that we only want to detach on the fake side, because we don't want to waste CPU cycles computing gradients on the generator. But we do need the gradients for the discriminator, right? After all, the point here is to train the discriminator.

I would also recommend detaching the fake images before you pass them to the discriminator, rather than detaching the predictions.
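Putting both suggestions together, the snippet above would take roughly this shape. The models and data below are hypothetical stand-ins (nn.Linear layers and random tensors) so the sketch runs on its own; in the assignment, gen, disc, real, and get_noise come from the course code.

```python
import torch
from torch import nn

# Toy stand-ins (hypothetical shapes) for the course objects.
num_images, z_dim, device = 16, 4, 'cpu'
gen = nn.Linear(z_dim, 8)
disc = nn.Linear(8, 1)
real = torch.randn(num_images, 8)
criterion = nn.BCEWithLogitsLoss()

noise = torch.randn(num_images, z_dim, device=device)
fake = gen(noise)

# Detach the fake IMAGES, not the predictions: this stops gradients at
# the generator while still letting them flow through the discriminator.
fake_preds = disc(fake.detach())
fake_loss = criterion(fake_preds, torch.zeros_like(fake_preds))

# Do NOT detach the real predictions: the discriminator's gradients flow
# through them, and training the discriminator is the whole point here.
real_preds = disc(real)
real_loss = criterion(real_preds, torch.ones_like(real_preds))

disc_loss = (fake_loss + real_loss) / 2
disc_loss.backward()  # no RuntimeError: disc_loss has a grad_fn
```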


Thank you for the advice. When I do not detach the real predictions, that specific error no longer presents itself.

Here’s a thread on why we sometimes want to detach outputs.