Hi,
This post is just to clarify the structure of the ExpandingBlock of the UNet model in Pix2Pix. My question is this:
In the video you talk about transpose convolutions in the expanding part of the model, which makes sense, because I want to upsample the image to create an n×n resolution image. But in the code you don't use a transpose convolution; instead you use a regular 2D convolution (nn.Conv2d):
class ExpandingBlock(nn.Module):
    '''
    ExpandingBlock Class:
    Performs an upsampling, a convolution, a concatenation of its two inputs,
    followed by two more convolutions with optional dropout
    Values:
        input_channels: the number of channels to expect from a given input
    '''
    def __init__(self, input_channels, use_dropout=False, use_bn=True):
        super(ExpandingBlock, self).__init__()
        self.upsample = nn.Upsample(scale_factor=2, mode='bilinear', align_corners=True)
        self.conv1 = nn.Conv2d(input_channels, input_channels // 2, kernel_size=2)
        self.conv2 = nn.Conv2d(input_channels, input_channels // 2, kernel_size=3, padding=1)
        self.conv3 = nn.Conv2d(input_channels // 2, input_channels // 2, kernel_size=2, padding=1)
        if use_bn:
            self.batchnorm = nn.BatchNorm2d(input_channels // 2)
        self.use_bn = use_bn
        self.activation = nn.ReLU()
        if use_dropout:
            self.dropout = nn.Dropout()
        self.use_dropout = use_dropout
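For comparison, here is a minimal sketch of the alternative I mean (this is my own illustration, not from the course code): the upsample-then-conv approach taken by the block above versus a single learned nn.ConvTranspose2d doing the upsampling in one step:

```python
import torch
import torch.nn as nn

x = torch.randn(1, 64, 16, 16)  # example feature map: batch 1, 64 channels, 16x16

# Approach from the ExpandingBlock above: fixed bilinear upsample, then conv.
upsample = nn.Upsample(scale_factor=2, mode='bilinear', align_corners=True)
conv = nn.Conv2d(64, 32, kernel_size=2)
out_upsample = conv(upsample(x))  # 16x16 -> 32x32 -> 31x31 (kernel_size=2, no padding)

# Alternative I am asking about: one learned transpose convolution.
tconv = nn.ConvTranspose2d(64, 32, kernel_size=2, stride=2)
out_tconv = tconv(x)  # 16x16 -> 32x32 in a single learned step

print(out_upsample.shape)  # torch.Size([1, 32, 31, 31])
print(out_tconv.shape)     # torch.Size([1, 32, 32, 32])
```

Both double the spatial resolution (up to the edge pixels trimmed by the kernel_size=2 conv), so functionally they seem interchangeable to me, which is why I am asking.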
So, I want to know exactly why I shouldn't use a transpose convolution (nn.ConvTranspose2d) here instead?
Thanks so much; I have learned a lot from these courses.