Conv2DTranspose Padding

As advised, I am opening a new thread about the padding parameter of the Conv2DTranspose layer following the previous thread here. My question is:

It would be cool if the lecture videos showed an example of “valid” and “same” Conv2DTranspose operations, because I also don’t understand what the Keras documentation means by:

"same" results in padding with zeros evenly to the left/right or up/down of the input such that output has the same height/width dimension as the input

For example in the week 3 programming assignment Image_segmentation_Unet_v2, in the function upsampling_block, the shape of the input to the first Conv2DTranspose layer, i.e. expansive_input, is (None, 12, 16, 256) and the output of that layer, i.e. up, has shape (None, 24, 32, 32), which is obviously a different shape than the input, despite using padding="same".

So, how am I to interpret the keras documentation?

Ahh, it’s because we set strides=(2, 2), right? Because if I were to set strides=(1, 1), then the height and width of up would be the same as those of expansive_input.
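This is easy to verify with a minimal sketch. The 32 filters and the (3, 3) kernel below are my assumptions, chosen so the strided output matches the up shape quoted above; only the batch/height/width behavior matters here:

```python
import tensorflow as tf

# Input shaped like expansive_input in the assignment: (batch, 12, 16, 256).
x = tf.random.normal((1, 12, 16, 256))

# With strides=(2, 2), padding="same" doubles height and width.
up2 = tf.keras.layers.Conv2DTranspose(32, (3, 3), strides=(2, 2), padding="same")(x)
print(up2.shape)  # (1, 24, 32, 32)

# With strides=(1, 1), padding="same" preserves height and width.
up1 = tf.keras.layers.Conv2DTranspose(32, (3, 3), strides=(1, 1), padding="same")(x)
print(up1.shape)  # (1, 12, 16, 32)
```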

However, I am still confused about padding="same". The Keras documentation says that the input is padded so that the output has the same dimensions as the input. However, in the video where Andrew explains the Conv2DTranspose operation, he pads the output with p=1.

Yes, this is a bit unfortunate in terms of the terminology, but that is the way TF/Keras does it. In that context, “same” padding gives you the same output shape only in the stride = 1 case. Otherwise, the input gets the amount of padding that would be required to produce the same dimensions with stride = 1, and then the actual output shape is determined by the real stride value together with that amount of padding and the rest of the parameters.

The above applies for both normal convolutions and transposed convolutions, although the actual calculations are different in the two cases. At least they are consistent that way. :nerd_face:
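To make the two cases concrete, here is a small sketch of the shape arithmetic for "same" padding (the function names are my own, not Keras API; the formulas match the behavior described above):

```python
import math

def same_conv_out(n, stride):
    # Normal convolution with padding="same": output = ceil(input / stride).
    # With stride = 1 this reduces to output = input.
    return math.ceil(n / stride)

def same_conv_transpose_out(n, stride):
    # Transposed convolution with padding="same": output = input * stride.
    # Again, with stride = 1 this reduces to output = input.
    return n * stride

# The Week 3 example: (12, 16) spatial dims with strides=(2, 2).
print(same_conv_transpose_out(12, 2), same_conv_transpose_out(16, 2))  # 24 32
```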