In the training step, I used “gen.forward” and “disc.forward” to run the inputs (forward pass) through the generator and discriminator nets respectively. But I get the following AttributeError when I run the cells containing the “test_disc_reasonable” and “test_gen_reasonable” functions, which are used for grading.
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
Input In [38], in <cell line: 73>()
70 if num_steps >= max_tests:
71 break
---> 73 test_disc_reasonable()
74 test_disc_loss()
75 print("Success!")
Input In [38], in test_disc_reasonable(num_images)
11 criterion = torch.mul # Multiply
12 real = torch.ones(num_images, z_dim)
---> 13 disc_loss = get_disc_loss(gen, disc, criterion, real, num_images, z_dim, 'cpu')
14 assert torch.all(torch.abs(disc_loss.mean() - 0.5) < 1e-5)
16 gen = torch.ones_like
Input In [34], in get_disc_loss(gen, disc, criterion, real, num_images, z_dim, device)
20 # These are the steps you will need to complete:
21 # 1) Create noise vectors and generate a batch (num_images) of fake images.
22 # Make sure to pass the device argument to the noise.
(...)
33
34 # Prediction of fakes
35 noise_inputs = get_noise(num_images, z_dim, device)
---> 36 fakes = gen.forward(noise_inputs).detach()
37 fake_pred = disc.forward(fakes)
38 fake_ground_truth = torch.zeros(num_images, 1, device=device)
AttributeError: 'builtin_function_or_method' object has no attribute 'forward'
I looked into the function (test_disc_reasonable), and it sets “gen” to a plain function rather than a Generator model, so of course it doesn’t have a “forward” attribute. These functions do “gen = torch.ones_like” or “gen = torch.zeros_like” and pass this to “get_disc_loss()” or “get_gen_loss()”, where gen is expected to be a Generator model. I am not supposed to change the testing functions since they are for grading.
Can someone please help me? Is there something that needs to be fixed in the testing functions, or am I wrong somewhere?
P.S. I ran the code skipping the cells containing these functions and the code seems to work fine. I get good results over time.
Hi Karthik_Dasaraju!
Welcome to the community.
Well, this is a tricky thing! I will try my best to explain. What you pointed out is correct. But the content of test_disc_reasonable is not written in the context of GANs. It only wants to “Verify shape, related to gen_noise parametrization” and a few similar things, and it uses simple stand-ins to check them, so it does not follow a GAN’s logic. The test_disc_loss function below it, on the other hand, does follow a GAN’s logic. So it is better not to rely on the test functions (they need not follow the correct logic; they use simple tricks to check whether our implementation is correct). We have to follow the instructions given in the cell. They mention this in the get_gen_loss function:
gen: the generator model, which returns an image given z-dimensional noise
So we have to implement it with respect to this. You can also verify this in the main training loop at the end of the notebook, where the actual training is implemented (all the other test functions only check whether we are implementing our helper functions correctly; they need not be in context). There they would have initialized gen as gen = Generator(z_dim).to(device).
If gen is an instance of Generator, then both of these are valid ways to run the forward pass:
- gen.forward(input_noise)
- gen(input_noise)
Both are correct, but (2) is the usual and recommended form. gen(input) goes through nn.Module’s __call__ method, which runs any hooks registered on the module (forward pre-hooks and forward hooks) and then calls forward; gen.forward(input) calls forward directly and skips those hooks. In every other respect the two behave the same: both accept the same arguments, neither moves tensors to another device for you (the module and its inputs must already be on the same device), and autograd tracks the operations either way. A practical advantage of gen(input) is that it works with any callable, not only with nn.Module instances.
That is probably why they wrote the test function this way, assuming that we would use approach (2). The stand-in gen in the test is a plain function, which can be called but has no forward attribute, so with approach (2) you would not get this error.
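To make this concrete, here is a minimal sketch of what the test does. The get_noise below is only a stand-in for the notebook's helper (assumed here to return Gaussian noise of shape (n_samples, z_dim)); the stub gen = torch.ones_like is the one visible in the traceback.

```python
import torch


def get_noise(n_samples, z_dim, device="cpu"):
    # Stand-in for the notebook's helper; assumed to return Gaussian noise.
    return torch.randn(n_samples, z_dim, device=device)


gen = torch.ones_like  # the test's stub: a plain function, not an nn.Module
noise = get_noise(4, 8)

# Calling the stub works, because any callable can be called.
fakes = gen(noise)

# The stub has no .forward attribute, so gen.forward(noise) fails.
stub_has_forward = hasattr(gen, "forward")
try:
    gen.forward(noise)
    error_name = None
except AttributeError as err:
    error_name = type(err).__name__

print(fakes.shape, stub_has_forward, error_name)
```

Writing gen(noise) inside get_disc_loss and get_gen_loss therefore works both with the real Generator and with the test's stub.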
Now, you might be wondering how gen(input_noise) ends up calling the forward method.
When you call gen(input_noise), Python invokes the __call__ method of the Generator instance gen with the input input_noise, and that method in turn invokes forward. Apart from any registered hooks, which __call__ also runs, this is equivalent to calling gen.forward(input_noise). The __call__ method is not defined explicitly in the Generator class. But Generator is not a base class; it inherits from torch.nn.Module. Creating gen also runs the super constructor (which initializes the base-class state), so the object can use the base-class methods as well. __call__ is defined in that base class.
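The dispatch described above can be checked with a small experiment. This is only a sketch: TinyGenerator is a hypothetical stand-in for the notebook's Generator, and the forward hook is there purely to show which call path goes through nn.Module's __call__.

```python
import torch
from torch import nn


class TinyGenerator(nn.Module):
    # Hypothetical stand-in for the notebook's Generator.
    def __init__(self, z_dim=8, im_dim=16):
        super().__init__()  # initializes nn.Module state (parameters, hooks, ...)
        self.net = nn.Linear(z_dim, im_dim)

    def forward(self, noise):
        return self.net(noise)


gen = TinyGenerator()
input_noise = torch.randn(4, 8)

# The subclass does not define __call__; it inherits it from nn.Module.
defines_call = "__call__" in TinyGenerator.__dict__

# Record every time the forward hook fires.
hook_calls = []
handle = gen.register_forward_hook(
    lambda module, inputs, output: hook_calls.append(tuple(output.shape))
)

out_call = gen(input_noise)             # nn.Module.__call__ -> hooks + forward
calls_after_call = len(hook_calls)
out_forward = gen.forward(input_noise)  # forward only, hooks are skipped
calls_after_forward = len(hook_calls)
handle.remove()

print(defines_call, calls_after_call, calls_after_forward)
```

Both calls produce the same output tensor here; the only observable difference is that the hook runs for gen(input_noise) and not for gen.forward(input_noise).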
Also, make sure you detach the generator’s output when you pass it to the discriminator to get its prediction (the second step), not in the first step where you generate the fake images with the generator (this is a conceptual mistake that I saw in your code).
Sorry for the long story. I hope this helps you understand what is happening here.
Regards,
Nithin
Thanks a lot Nithin. I understood everything and fixed my code. It doesn’t throw up an error now. Thanks for the explanation on implementation of gen(noise) opposed to calling the forward.
Regarding the usage of detach() that you mentioned at the end, I think it should be in the first step, i.e. detach() should be called as soon as the outputs have been generated: gen(noise).detach(). I tried your way, i.e. disc(fake_images).detach(), but the code didn’t work as intended. The generator did not seem to learn at all, and I got “Runtime test case failed” printed in some tests. I don’t know exactly how detach() works; perhaps you could message me about it. I don’t mind long answers.
Hi Karthik!
.detach() is a PyTorch method that creates a new tensor that shares the same underlying data as the original tensor but is not attached to the computation graph. Gradients will not be backpropagated through the detached tensor, so nothing upstream of it is updated through that path. What I suggested is to detach the fakes tensor in the disc call, as mentioned in the instructions in the cell, i.e. disc(fakes.detach()) and not disc(fakes).detach().
The difference between the two is this:
In the former, disc(fakes.detach()), we first detach the fake image tensor from the computational graph and then pass it to the disc model. Gradients are therefore not propagated back into the generator during the backward pass, so only the discriminator is updated.
In the latter, disc(fakes).detach(), we first pass the fake image tensor to the disc model and then detach the resulting prediction. This cuts the graph at the discriminator’s output, so no gradients flow back into the discriminator or the generator. A loss built only from that prediction updates neither model, which is likely why your tests failed when you tried it.
The intention of this cell is to get the discriminator loss which is why the former is prescribed.
Your original implementation, gen(noise).detach() followed by disc(fakes), will also work, because it is equivalent to the former: in both cases the tensor that reaches disc is already detached from the generator. I’m really sorry for creating this confusion. I didn’t go through your code properly at first and only noticed it while I was explaining. You can go either way (your version or the former one).
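The effect of where detach() is placed can be seen on a toy example. This is a sketch under assumptions, not the notebook's code: the single nn.Linear layers are hypothetical stand-ins for the Generator and Discriminator, and nn.BCEWithLogitsLoss is assumed as the criterion.

```python
import torch
from torch import nn

torch.manual_seed(0)
z_dim, im_dim, num_images = 8, 16, 4
gen = nn.Linear(z_dim, im_dim)      # hypothetical tiny "generator"
disc = nn.Linear(im_dim, 1)         # hypothetical tiny "discriminator"
criterion = nn.BCEWithLogitsLoss()  # assumed loss for this sketch
noise = torch.randn(num_images, z_dim)
fake_ground_truth = torch.zeros(num_images, 1)

# (a) Detach the fakes before they reach disc.
#     gen(noise).detach() followed by disc(fakes) is the same thing.
fakes = gen(noise)
loss_a = criterion(disc(fakes.detach()), fake_ground_truth)
loss_a.backward()
gen_got_grad = gen.weight.grad is not None    # no gradient reached gen
disc_got_grad = disc.weight.grad is not None  # disc received a gradient

# (b) Detach the discriminator's prediction instead.
fake_pred = disc(gen(noise)).detach()
loss_b = criterion(fake_pred, fake_ground_truth)
loss_b_requires_grad = loss_b.requires_grad   # graph is cut at disc's output
try:
    loss_b.backward()
    backward_failed = False
except RuntimeError:
    backward_failed = True                    # nothing left to differentiate

print(gen_got_grad, disc_got_grad, loss_b_requires_grad, backward_failed)
```

In variant (a) only the discriminator's parameters receive gradients, which is what a discriminator loss needs. In variant (b) the loss is disconnected from both networks, so neither can learn from it.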
Regards,
Nithin