In the answer at that link, the mentor said that “same” padding is used so that the volume does not shrink during upsampling. I understand that this is true for the Conv2D's padding in the upsampling block, but is it also true for the Conv2DTranspose's padding? In a transposed convolution we can't get an output smaller than the input tensor, so why is “same” padding used in the Conv2DTranspose?
Your understanding is correct: a Conv2DTranspose output can't be smaller than its input. As far as the padding parameter goes, this link should help you understand how the output size is a scaled version of the input dimensions.
I was running into the same question too, and from a quick search I was surprised by the lack of a thorough explanation of what this actually does. Here is my best attempt:
Per the TensorFlow docs, “same” results in padding evenly to the left/right or up/down of the input. When padding=“same” and strides=1, the output has the same size as the input.
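To make that concrete, here is a minimal sketch (the shapes are arbitrary, chosen just for illustration) showing that with padding='same' and strides=1, both Conv2D and Conv2DTranspose preserve the spatial size:

import tensorflow as tf

x = tf.random.normal((1, 8, 8, 3))  # batch of one 8x8 feature map with 3 channels
y_conv = tf.keras.layers.Conv2D(4, kernel_size=3, strides=1, padding='same')(x)
y_tconv = tf.keras.layers.Conv2DTranspose(4, kernel_size=3, strides=1, padding='same')(x)
print(y_conv.shape)   # (1, 8, 8, 4) -- spatial size unchanged
print(y_tconv.shape)  # (1, 8, 8, 4) -- spatial size unchanged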
Normally for a transposed convolution, we have output_size = stride * (input_size - 1) + kernel_size - 2 * padding. As a side note, in a transposed convolution the padding term trims space from around the output rather than adding it (note the minus sign). There is some further explanation here.
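As a sanity check on that formula: Keras only accepts 'valid' or 'same' for the padding argument, and 'valid' corresponds to padding = 0 in the formula above (my reading; shapes chosen just for illustration):

import tensorflow as tf

x = tf.random.normal((1, 2, 2, 1))
# padding='valid' means padding = 0, so output_size = stride * (input_size - 1) + kernel_size
y = tf.keras.layers.Conv2DTranspose(filters=1, kernel_size=3, strides=2, padding='valid')(x)
print(y.shape)  # (1, 5, 5, 1), since 2 * (2 - 1) + 3 = 5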
At first this seemed inconsistent with what I observed, because the formula alone doesn't say which padding value “same” corresponds to. So I ran the experiment below. The behavior I saw is that TensorFlow keeps output_size = stride * input_size, which falls out of the formula above if the total padding is chosen as kernel_size - stride.
Experiment:
import tensorflow as tf

t = tf.constant([
    [
        [0, 1],
        [5, 6],
    ]
], dtype=tf.float32)
t = tf.expand_dims(t, axis=-1)  # Now t has shape (1, 2, 2, 1)
# Conv2DTranspose needs a 4D input; with channels_last that is (batch, height, width, channels),
# which is why the input is shaped (1, 2, 2, 1)
conv = tf.keras.layers.Conv2DTranspose(
    filters=1,
    kernel_size=3,
    strides=(2, 2),
    padding='same',
    data_format='channels_last',
    dilation_rate=(1, 1),
    activation=None,
    use_bias=False,
    kernel_initializer='he_normal',
    kernel_regularizer=None,
    bias_regularizer=None,
    activity_regularizer=None,
    kernel_constraint=None,
    bias_constraint=None,
)(t)
print(conv.shape)  # (1, 4, 4, 1)
I see (1, 4, 4, 1) as output. I tried inputs with other spatial shapes, such as (2, 3) and (3, 3), and the behavior output_size = stride * input_size was consistent. I was inspired by ChatGPT o1-preview here, so please do point out any incorrectness (I couldn't find a more detailed explanation of this, though).
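For anyone who wants to reproduce this, here is a small loop (my own sketch, keeping kernel_size=3 from the experiment above) that checks the output_size = stride * input_size rule across a few shapes and strides:

import tensorflow as tf

for h, w in [(2, 2), (2, 3), (3, 3)]:
    for s in [1, 2, 3]:
        x = tf.random.normal((1, h, w, 1))
        y = tf.keras.layers.Conv2DTranspose(1, kernel_size=3, strides=s, padding='same')(x)
        # With padding='same', the output spatial size is always stride * input size
        assert tuple(y.shape) == (1, s * h, s * w, 1), (h, w, s, y.shape)
print("output_size = stride * input_size held for all cases")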
@ai_curious I saw a nice response from you in 2021. As of 2024, the output_padding part has been removed from the TensorFlow API.