C1W1 UNQ_C6 Assertion Error without any information


AssertionError                            Traceback (most recent call last)
<ipython-input-...> in <module>
     71     break
     72 
---> 73 test_disc_reasonable()
     74 test_disc_loss()
     75 print("Success!")

<ipython-input-...> in test_disc_reasonable(num_images)
     33     disc_loss = get_disc_loss(gen, disc, criterion, real, num_images, z_dim, 'cpu').mean()
     34     disc_loss.backward()
---> 35     assert torch.isclose(torch.abs(disc.weight.grad.mean() - 11.25), torch.tensor(3.75))
     36 
     37 def test_disc_loss(max_tests = 10):

AssertionError:

Hi @tmsaur, welcome back to the community.

If you send me your code for C6 in a direct message, I can take a look and provide a hint as to what's going on.

Thanks,

Juan


Hi @tmsaur,

Since the generator is needed when calculating the discriminator's loss, you will need to call .detach() on the generator's output to ensure that only the discriminator is updated.

Sometimes we call .detach() on the wrong result.

Please evaluate this aspect of your implementation and see if you can modify it to fit the requirement.
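
For reference, one common way the detach call fits into get_disc_loss looks roughly like the sketch below. This follows the signature visible in your traceback; get_noise is assumed to be the noise helper defined earlier in the assignment, and the exact loss terms in your notebook may differ.

```python
import torch

def get_noise(n_samples, z_dim, device='cpu'):
    # Assumed helper from earlier in the assignment: random latent vectors
    return torch.randn(n_samples, z_dim, device=device)

def get_disc_loss(gen, disc, criterion, real, num_images, z_dim, device):
    # Generate fake images, then detach them so the backward pass
    # does not push gradients into the generator's parameters.
    noise = get_noise(num_images, z_dim, device=device)
    fake = gen(noise)
    disc_fake_pred = disc(fake.detach())          # detach the *generator's output*
    disc_fake_loss = criterion(disc_fake_pred, torch.zeros_like(disc_fake_pred))
    disc_real_pred = disc(real)
    disc_real_loss = criterion(disc_real_pred, torch.ones_like(disc_real_pred))
    return (disc_fake_loss + disc_real_loss) / 2
```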

Let me know how it goes!

Juan


Juan, got it working. Thanks.

I feel I have to understand this a bit better.
This is what the detach() method does:

detach():

Sometimes, you want to calculate and use a tensor’s value without calculating its gradients. For example, if you have two models, A and B, and you want to directly optimize the parameters of A with respect to the output of B, without calculating the gradients through B, then you could feed the detached output of B to A. There are many reasons you might want to do this, including efficiency or cyclical dependencies (i.e. A depends on B depends on A).
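
That A/B scenario can be seen with two tiny stand-in models; the names and shapes below are made up purely for illustration:

```python
import torch
from torch import nn

A = nn.Linear(4, 1)  # the model we want to optimize
B = nn.Linear(4, 4)  # the model whose output we use but do not want to update

x = torch.randn(8, 4)
out_b = B(x).detach()        # cut the graph here: no gradients will reach B
loss = A(out_b).mean()
loss.backward()

print(A.weight.grad is not None)  # True  - A received gradients
print(B.weight.grad is None)      # True  - B's parameters were untouched
```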

So the reason for calling detach() here is that the Generator and Discriminator share the Tensor? And if you don't do it, you will calculate the loss on all of the Tensor's content? But you only want the Discriminator's part?

Yes, the reason for calling detach() here is that the result comes from the Generator and will be used by the Discriminator. If we don't detach it, the Generator will be affected as well when the loss is backpropagated - the returned tensor still shares the Generator's computation graph. Calling detach() "breaks" the gradient connection with the Generator.
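
A quick way to see that "break" for yourself is a small standalone check (not part of the assignment; the Linear layer is just a stand-in for the Generator):

```python
import torch
from torch import nn

gen = nn.Linear(10, 10)              # stand-in for the Generator
fake = gen(torch.randn(1, 10))

print(fake.grad_fn)                  # e.g. <AddmmBackward0 ...> - still tied to gen's graph
print(fake.detach().grad_fn)         # None  - the detached tensor carries no history
print(fake.detach().requires_grad)   # False
```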