Here is my code for the get_disc_loss and I think I am detaching the generator correctly, but I am still getting an assertion error:
{moderator edit - solution code removed}
Note that detach() is not an “in place” operation: it returns a new tensor that is detached from the graph and leaves the original unchanged. So that means you are discarding the detached copy, and disc_pred_fake has already been created with the non-detached instance of gen_img in any case, meaning that your detach happens too late.
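To make the “not in place” point concrete, here is a minimal sketch that is independent of the assignment code. The names x, y and z are just illustrative tensors, not anything from the notebook:

```python
import torch

x = torch.ones(3, requires_grad=True)
y = x * 2                 # y is part of the autograd graph

y.detach()                # return value is thrown away, so y is unchanged
still_attached = y.requires_grad

z = y.detach()            # keep the returned tensor instead
detached = z.requires_grad

print(still_attached, detached)
```

The first call has no effect on y, because the detached tensor it returns is never assigned to anything. Only z, which holds the returned tensor, is cut off from the graph.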
I have a similar error. I detached the generator from the predictions before I computed the loss for the fake images (and did not detach for the real images), but I am seeing the same error at runtime.
{moderator edit - solution code removed}
The error is as follows -
RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn
I tried moving the detach() around but haven’t had any luck. Can you help point me in the right direction?
I think you are doing the detach too late: you are detaching the output of the discriminator for the fake images. The point is you want to detach the fake images before you feed them to the discriminator. The way you are doing it, the discriminator doesn’t have gradients for the fake case, which is not what you want, right?
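Here is a rough sketch of the difference, using tiny stand-in networks rather than the assignment's models. The nn.Linear layers, the sizes and the BCEWithLogitsLoss criterion are all assumptions made just for this illustration:

```python
import torch
from torch import nn

torch.manual_seed(0)
gen = nn.Linear(4, 8)     # hypothetical stand-in for the generator
disc = nn.Linear(8, 1)    # hypothetical stand-in for the discriminator
criterion = nn.BCEWithLogitsLoss()
noise = torch.randn(5, 4)

# Too late: detaching the discriminator's output cuts the discriminator
# out of the graph as well, so the loss has nothing to backpropagate to.
pred_late = disc(gen(noise)).detach()
loss_late = criterion(pred_late, torch.zeros_like(pred_late))
try:
    loss_late.backward()
    late_error = None
except RuntimeError as err:
    late_error = str(err)

# Detach the fake images before the discriminator sees them: the
# discriminator still gets gradients, the generator does not.
fake = gen(noise).detach()
pred_fake = disc(fake)
loss = criterion(pred_fake, torch.zeros_like(pred_fake))
loss.backward()

disc_has_grad = disc.weight.grad is not None
gen_has_grad = gen.weight.grad is not None
```

In the first case backward() raises a RuntimeError, because the loss no longer depends on any tensor that requires grad. In the second case the discriminator's weights receive gradients and the generator's do not, which is what you want when training the discriminator.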
Thanks for the prompt response, @paulinpaloalto! The explanation makes complete sense - however, even when I try detaching the fake images before I feed them to the discriminator, I still see the same error at runtime. What might I be doing wrong?
fake = gen(noise).detach()
pred_fake = disc(fake)
Well, how about trying to detach the fake images after the generator and before you feed them to the discriminator? That’s how I implemented it and it worked for me.
But I would have expected your solution not to throw that particular error. The other thing to keep in mind is that just typing new code into a function and then calling it again does nothing: it still runs the old code. You have to press “Shift-Enter” on the actual function cell to execute the new code and get it loaded into the runtime image.
It might be something else, in that case. I’ll keep trying. Thanks for your help @paulinpaloalto!