[Week 3] As 2 - Ex 2: upsampling_block() Why do we use padding 'same' in Conv2DTranspose layer? What is the difference between 'padding' and output_padding?

Why do we use padding ‘same’ in the Conv2DTranspose layer? To my understanding, the idea of using padding is to keep the dimensions of the input in the output (as explained below in the reference from TensorFlow). This is the opposite of what we are trying to do when upsampling or when using transpose convolution.

What is the difference between ‘padding’ and output_padding?
Is my previous question related to this difference?

Could anybody please help and explain this difference in further detail?
Many thanks!

Taken from tf.keras.layers.Conv2DTranspose | TensorFlow Core v2.4.1:

padding: one of "valid" or "same" (case-insensitive). "valid" means no padding. "same" results in padding evenly to the left/right or up/down of the input such that output has the same height/width dimension as the input.

output_padding: An integer or tuple/list of 2 integers, specifying the amount of padding along the height and width of the output tensor. Can be a single integer to specify the same value for all spatial dimensions. The amount of output padding along a given dimension must be lower than the stride along that same dimension. If set to None (default), the output shape is inferred.

Hi garrofederico,

Here’s my two cents.

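The ‘same’ padding is used so that the output of the upsampling layer is exactly the input size multiplied by the stride (with a stride of 2, height and width double). That keeps the expanding path symmetric to the contracting path and lets the upsampled volume be concatenated with the matching skip connection. Note that for a transposed convolution ‘same’ only gives an output the same size as the input when the stride is 1. With ‘valid’ the output would not shrink; when the kernel is larger than the stride it comes out slightly larger than input size times stride, so it would no longer line up with the contracting path.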
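To make the size argument concrete, here is a small sketch of the output-length arithmetic along one axis. It assumes the shape inference Keras applies when output_padding is left at None; the helper name transpose_out_len is made up for illustration and is not part of TensorFlow.

```python
def transpose_out_len(in_len, kernel, stride, padding):
    """Output length of a transposed convolution along one axis.

    Assumption: mirrors the Keras shape inference when output_padding=None.
    Hypothetical helper, not a TensorFlow function.
    """
    if padding == "same":
        # Output is exactly the input scaled by the stride.
        return in_len * stride
    if padding == "valid":
        # The kernel overhang beyond the stride is kept in the output.
        return in_len * stride + max(kernel - stride, 0)
    raise ValueError(f"unsupported padding: {padding!r}")


# A 3x3 kernel with stride 2, applied to a 48-pixel-wide feature map.
same_len = transpose_out_len(48, kernel=3, stride=2, padding="same")
valid_len = transpose_out_len(48, kernel=3, stride=2, padding="valid")
print(same_len, valid_len)  # 96 97
```

With ‘same’ the 48-wide map becomes exactly 96 wide, so it matches a 96-wide skip connection from the contracting path. With ‘valid’ it becomes 97 wide and the concatenation would fail.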

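‘Padding’ describes how the borders are treated in the convolution itself, whereas output_padding adds extra size to the output volume, on top of what the convolution produces. The reason for output_padding is explained in the video on Transpose Convolution: with a stride greater than 1, several output sizes are consistent with the same input size, and output_padding (which must be lower than the stride) selects which one you get. If you leave it at None, as in the assignment, the output shape is inferred for you.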
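The ambiguity that output_padding resolves can be sketched with a bit of arithmetic. This assumes the formula Keras uses when output_padding is given explicitly (with the border padding taken as kernel // 2 for ‘same’ and 0 for ‘valid’); both helper names are hypothetical and not part of TensorFlow.

```python
def conv_out_len_same(in_len, stride):
    """Output length of a forward strided convolution with 'same' padding."""
    return -(-in_len // stride)  # ceil(in_len / stride)


def transpose_out_len_explicit(in_len, kernel, stride, padding, output_padding):
    """Output length of a transposed convolution with explicit output_padding.

    Assumption: follows the Keras formula for a non-None output_padding.
    """
    if output_padding >= stride:
        raise ValueError("output_padding must be lower than the stride")
    pad = kernel // 2 if padding == "same" else 0
    return (in_len - 1) * stride + kernel - 2 * pad + output_padding


# Going down: inputs of width 7 and 8 both shrink to width 4 at stride 2.
print(conv_out_len_same(7, 2), conv_out_len_same(8, 2))  # 4 4

# Going back up from width 4, output_padding chooses between 7 and 8.
sizes = [transpose_out_len_explicit(4, 3, 2, "same", op) for op in (0, 1)]
print(sizes)  # [7, 8]
```

So output_padding does not pad with a border the way ‘padding’ does; it only decides which of the possible output sizes the layer produces.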

If you want to dive deeper into this, you can also have a look at the Ronneberger et al. (2015) paper which you can find here: [1505.04597] U-Net: Convolutional Networks for Biomedical Image Segmentation