I am having an issue in the function ‘get_gen_loss’ regarding the tensor dimensions. Even though I am using the same ‘disc’ function here as in ‘get_disc_loss’ (where it works properly), here it results in [10, 64] tensors instead of [10, 1].
Here is my code; I am really struggling to figure out what the problem is. Thanks!
{moderator edit - solution code removed}
I added print statements to my get_gen_loss code and here’s what I see when I run the test cell:
fakes.shape torch.Size([10, 64])
fake_pred.shape torch.Size([10, 64])
fakes.shape torch.Size([10, 64])
fake_pred.shape torch.Size([10, 64])
fakes.shape torch.Size([18, 784])
fake_pred.shape torch.Size([18, 1])
Success!
So whatever the problem is, it looks like it’s not the dimensions of the output. Note that there are several different test cases and not all of them have the same dimensions. Also note that the test cases here are pretty complicated and that the disc function is an argument to get_gen_loss. In some of the test cases, they use nn.Identity as the disc function just because of the intent of the test.
I think the place to look is the way you are generating fake_labels. You should just use ones_like on the output of the discriminator with the fake images as input. That will give you the right shape and also take care of the device settings. If you print the shape of the fake_labels you are getting, that will be a clue about what is wrong.
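To illustrate the advice above, here is a minimal sketch of what a get_gen_loss along those lines could look like. The signature (gen, disc, criterion, num_images, z_dim, device) mirrors the usual assignment setup but is an assumption here, not the official solution; the key points are the ones_like call on the discriminator output and the absence of .detach():

```python
import torch
import torch.nn as nn

def get_gen_loss(gen, disc, criterion, num_images, z_dim, device):
    # Hypothetical sketch; argument names are assumed, not from the course code.
    noise = torch.randn(num_images, z_dim, device=device)
    fake = gen(noise)            # no .detach(): the generator needs gradients here
    fake_pred = disc(fake)
    # ones_like matches both the shape and the device of the discriminator
    # output, even when disc is nn.Identity and fake_pred is not [N, 1]
    gen_loss = criterion(fake_pred, torch.ones_like(fake_pred))
    return gen_loss
```

Because the labels are built from fake_pred itself, this works unchanged whether the discriminator returns [10, 1] or, as with nn.Identity in the tests, [10, 64].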
Also note that you don’t want to detach the fake images in the case that you are training the generator. You need the gradients of the generator in this case.
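A toy example (these two linear layers are just an illustration, not the assignment's models) shows why detaching the fake images blocks generator training:

```python
import torch
import torch.nn as nn

# Stand-in "generator" and "discriminator" for demonstration purposes only.
gen = nn.Linear(4, 4)
disc = nn.Linear(4, 1)
noise = torch.randn(3, 4)

# Detached: the computation graph is cut before gen, so gen gets no gradients.
disc(gen(noise).detach()).mean().backward()
print(gen.weight.grad)  # None -- the generator would never learn

# Not detached: gradients flow through disc back into gen's weights.
disc(gen(noise)).mean().backward()
print(gen.weight.grad is not None)  # True
```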
Thank you very much for your response! I didn’t realize that I shouldn’t include a .detach() call in the ‘gen’ model’s training process. I removed it and now my model is training!
That’s great news! Here’s a thread from a while back that explains the detach issue in a bit more detail.