When we use nn.ConvTranspose2d to build the Generator, we have code like this:
And then we run the test code:
I know this works. But given that we have 100 1D noise vectors at the input, shouldn’t we be using ConvTranspose1d? What am I missing?
The transpose convolutions are not being applied directly to the “noise”, right? The noise vector is first reshaped into a small “image”, and then the transpose convolutions “upsample” it to create progressively larger outputs, which is what the generator needs to do. The point of 1D versus 2D versus 3D, for both forward and transpose convolutions, is how many “spatial” dimensions you are dealing with. Here we are creating images, which have two spatial dimensions plus the “color” dimension with 3 “channels” for R, G and B. So that is why we need a 2D transpose convolution: we are doing the convolution operation over the height and width spatial dimensions.
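Here is a minimal sketch of that idea (not the exact assignment code; the layer sizes and the z_dim value are just placeholders): the noise is viewed as a 1x1 “image” with z_dim channels, and each ConvTranspose2d layer grows the height and width.

```python
import torch
import torch.nn as nn

z_dim = 64        # assumed noise vector length, just for illustration
batch_size = 128

gen = nn.Sequential(
    # (batch, z_dim, 1, 1) -> (batch, 256, 4, 4)
    nn.ConvTranspose2d(z_dim, 256, kernel_size=4, stride=1, padding=0),
    nn.BatchNorm2d(256),
    nn.ReLU(),
    # (batch, 256, 4, 4) -> (batch, 128, 8, 8)
    nn.ConvTranspose2d(256, 128, kernel_size=4, stride=2, padding=1),
    nn.BatchNorm2d(128),
    nn.ReLU(),
    # (batch, 128, 8, 8) -> (batch, 3, 16, 16): 3 output channels for R, G, B
    nn.ConvTranspose2d(128, 3, kernel_size=4, stride=2, padding=1),
    nn.Tanh(),
)

noise = torch.randn(batch_size, z_dim)            # 1D noise vectors
fake = gen(noise.view(batch_size, z_dim, 1, 1))   # add the two spatial dims
print(fake.shape)  # torch.Size([128, 3, 16, 16])
```

Notice that the “1D” noise only becomes 4D (batch, channels, height, width) after the view; from that point on, everything the network does operates over the two spatial dimensions.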
Just to continue the explanation, a 3D convolution (either forward or transpose) is what you would use with volumetric data, e.g. medical images like CT scans, where you have 3 spatial dimensions (height, width and depth).
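If it helps, you can see the dimension count directly from the input shape each variant expects (again just a quick sketch, not course code):

```python
import torch
import torch.nn as nn

# 1D: input is (batch, channels, length), e.g. audio or other sequences
print(nn.ConvTranspose1d(16, 8, kernel_size=4)(torch.randn(2, 16, 10)).shape)
# torch.Size([2, 8, 13])

# 2D: input is (batch, channels, height, width), e.g. RGB images
print(nn.ConvTranspose2d(16, 8, kernel_size=4)(torch.randn(2, 16, 10, 10)).shape)
# torch.Size([2, 8, 13, 13])

# 3D: input is (batch, channels, depth, height, width), e.g. CT volumes
print(nn.ConvTranspose3d(16, 8, kernel_size=4)(torch.randn(2, 16, 10, 10, 10)).shape)
# torch.Size([2, 8, 13, 13, 13])
```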
BTW you filed this under “General Discussion”, but it looks like you are talking about material in the GANs Specialization Course 1 Week 2. If I’ve got that right, it would be a nice idea to move this thread to the appropriate subtopic. You can edit that using the little “edit pencil” on the title. Or let me know if I’ve got the target course right and I can move it for you.