What is the meaning of ‘same’ padding for transpose convolution?

@Meir

Convolution involves applying a filter to a local region of the input, shifting the filter, and repeating the process until the entire sample is covered. Without padding, depending on the filter size, the output of this operation can be smaller than the original input. The larger the filter, the smaller the output gets.

(If you draw a diagram of an image and a filter and try to move the filter around, it may give you better intuition.)

This can be an issue if you want the size of the output to be the same as the input. To prevent this, you can pad the input data to make it artificially bigger, so that the output size matches the original input size.
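To make the shrinking concrete, here is a minimal sketch (not from the original post) of the standard output-size rule for a Keras-style convolution, where `"valid"` means no padding and `"same"` pads so the output size is the input size divided by the stride, rounded up:

```python
import math

def conv_output_size(size, kernel, stride=1, padding="valid"):
    """Output size of one spatial dimension for a Keras-style convolution.

    "valid": no padding; the filter must fit entirely inside the input.
    "same":  zero-pad the input so that output = ceil(size / stride).
    """
    if padding == "valid":
        return (size - kernel) // stride + 1
    if padding == "same":
        return math.ceil(size / stride)
    raise ValueError(f"unknown padding: {padding}")

# A 5x5 input convolved with a 3x3 filter (stride 1) shrinks to 3x3...
print(conv_output_size(5, 3))                   # 3
# ...unless "same" padding is used, which preserves the 5x5 size.
print(conv_output_size(5, 3, padding="same"))   # 5
```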

You can review the material below to get more intuition.

I hope that helps.

Suki

The question was about **transpose** convolution…

A late reply, but hope it helps someone.

Intuitively, it means the opposite of ‘same’ for ordinary convolution. Where ‘same’ padding for a convolution pads the input so that the output dimensions match the input dimensions, for a transpose convolution it reduces the spatial dimensions of the 1D/2D output tensor relative to the ‘valid’ result.

For example, the output dimension for a 2x2 input transpose-convolved with a kernel of size 2 and stride 1 will be 3x3 with ‘valid’ padding. However, with ‘same’ padding it will be 2x2. This makes up-scaling and down-scaling easy to pair, since a transpose convolution then essentially reverses the corresponding convolution when used with the same configuration.
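These numbers can be reproduced with a small sketch (not from the original reply) of the Keras shape rules for transpose convolution with the default inferred output shape: for `"valid"`, output = (in − 1) × stride + kernel; for `"same"`, output = in × stride:

```python
def conv_transpose_output_size(size, kernel, stride=1, padding="valid"):
    """Output size of one spatial dimension for a Keras-style Conv2DTranspose
    (with the default output_padding=None, i.e. the inferred output shape)."""
    if padding == "valid":
        return (size - 1) * stride + kernel
    if padding == "same":
        return size * stride
    raise ValueError(f"unknown padding: {padding}")

# 2x2 input, kernel 2, stride 1: "valid" grows the output to 3x3...
print(conv_transpose_output_size(2, 2))                   # 3
# ...while "same" keeps it at 2x2 (input size times stride 1).
print(conv_transpose_output_size(2, 2, padding="same"))   # 2
```

Note that with ‘same’ padding the output only matches the input size when the stride is 1; with stride 2, for instance, the output is twice the input size.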

@Debabrata_Mandal provided a helpful reply. However, when I see a question like this in the forum I go look at the documentation. Sometimes it is so terse it isn’t helpful. In this case, though, it seems pretty straightforward. The Keras Conv2DTranspose layer accepts two padding-related parameters, as the input and output shapes can be padded independently.

**padding**: one of *"valid"* or *"same"* (case-insensitive). *"valid"* means no padding. *"same"* results in padding with zeros evenly to the left/right or up/down of the input such that the output has the same height/width dimension as the input.

**output_padding** : An integer or tuple/list of 2 integers, specifying the amount of padding along the height and width of the output tensor. Can be a single integer to specify the same value for all spatial dimensions. The amount of output padding along a given dimension must be lower than the stride along that same dimension. If set to *None* (default), the output shape is inferred.

The doc also provides a useful expression for seeing exactly how the output shape is formed:

```
new_rows = ((rows - 1) * strides[0] + kernel_size[0] - 2 * padding[0] +
output_padding[0])
new_cols = ((cols - 1) * strides[1] + kernel_size[1] - 2 * padding[1] +
output_padding[1])
```
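Plugging hypothetical numbers into that formula (a sketch, not from the documentation itself) shows how the two padding modes play out. Assume a 12-row input, a 3x3 kernel, and stride 2; for ‘same’ padding, Keras effectively uses padding = 1 with an inferred output_padding of 1 here, so the output comes out to exactly rows × stride:

```python
def transpose_output(rows, stride, kernel, padding, output_padding):
    # Direct transcription of the shape formula from the Keras docs.
    return (rows - 1) * stride + kernel - 2 * padding + output_padding

# "valid": padding = 0 and (inferred) output_padding = 0.
print(transpose_output(12, 2, 3, 0, 0))  # 25
# "same" (for these assumed numbers): padding = 1, inferred
# output_padding = 1, giving rows * stride = 24.
print(transpose_output(12, 2, 3, 1, 1))  # 24
```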

See [Conv2DTranspose layer]

It would be cool if the lecture videos showed an example of “valid” and “same” `Conv2DTranspose` operations, because I also don’t understand what the Keras documentation means by:

*”same"* results in padding with zeros evenly to the left/right or up/down of the input such that *output has the same height/width dimension as the input*

For example, in the week 3 programming assignment Image_segmentation_Unet_v2, in the function `upsampling_block`, the shape of the input to the first `Conv2DTranspose` layer, i.e. `expansive_input`, is `(None, 12, 16, 256)`, and the output of that layer, i.e. `up`, has shape `(None, 24, 32, 32)`, which is obviously a different shape than the input, despite using `padding="same"`.

So, how am I to interpret the Keras documentation?

This thread is 8 months old. Please start a new thread, so a currently active mentor will notice it and reply.