Hi,
I am finding it a little difficult to understand the parameters used in the generator block member function, specifically those of ConvTranspose2d.
I have looked at the torch documentation and at the DCGAN paper, but I find everything very confusing. For instance, how do we choose the different parameters of ConvTranspose2d, and how do they affect the output size of the tensor?
For example, how do we achieve this architecture? What parameters should we use?
Many thanks
Hello @Pablo_Prieto_Roca, Thank you for the question!
The formula for the output size of ConvTranspose2d is:
H_out = (H_in − 1) × stride[0] − 2 × padding[0] + dilation[0] × (kernel_size[0] − 1) + output_padding[0] + 1
W_out = (W_in − 1) × stride[1] − 2 × padding[1] + dilation[1] × (kernel_size[1] − 1) + output_padding[1] + 1
By default, stride=1, padding=0, output_padding=0, and dilation=1.
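If it helps to experiment with the numbers, the formula can be written as a small plain-Python helper. This is only a sketch: the function name conv_transpose2d_out_size is made up for illustration and is not part of PyTorch, and it assumes a single integer per parameter (the same value for height and width).

```python
def conv_transpose2d_out_size(size_in, kernel_size, stride=1, padding=0,
                              output_padding=0, dilation=1):
    """Output length along one spatial dimension (height or width).

    Hypothetical helper, not a PyTorch function. The defaults mirror
    the ConvTranspose2d defaults mentioned above.
    """
    return ((size_in - 1) * stride
            - 2 * padding
            + dilation * (kernel_size - 1)
            + output_padding
            + 1)


# kernel_size=4, stride=2, padding=1 doubles the spatial size:
# (4 - 1) * 2 - 2 * 1 + 1 * (4 - 1) + 0 + 1 = 8
h_out = conv_transpose2d_out_size(4, kernel_size=4, stride=2, padding=1)
print(h_out)  # 8
```

Changing one argument at a time is a quick way to see how stride, padding, and kernel_size each move the output size.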
Here is an example from this post:
nn.ConvTranspose2d(in_channels, out_channels, kernel_size=3, stride=2, padding=0)
For an input of height H_in = 5, H_out will be:
H_out = (5-1)*2 - 2*0 + 1*(3-1) + 0 + 1 = 11
See this post for more explanations of ConvTranspose2d parameters.
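To reason about a whole generator, you can apply the same formula layer by layer and watch the spatial size grow. The sketch below uses a made-up stack of (kernel_size, stride) pairs with padding=0; it is an assumption chosen to illustrate the idea and is not necessarily the architecture you were asking about.

```python
def out_size(size_in, kernel_size, stride=1, padding=0,
             output_padding=0, dilation=1):
    # Same output-size formula as above, for one spatial dimension.
    return ((size_in - 1) * stride - 2 * padding
            + dilation * (kernel_size - 1) + output_padding + 1)


# Hypothetical stack of transposed-conv layers as (kernel_size, stride);
# padding is left at its default of 0 for every layer.
layers = [(3, 2), (4, 1), (3, 2), (4, 2)]

size = 1            # start from a 1x1 noise "image"
sizes = [size]
for kernel_size, stride in layers:
    size = out_size(size, kernel_size, stride)
    sizes.append(size)

print(sizes)  # [1, 3, 6, 13, 28]
```

Working backwards from the target size (28 here) and adjusting kernel_size and stride per layer is one way to pick the parameters.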