In the answer at that link, the mentor said that “same” padding is used so that the volume does not shrink during upsampling. I understand that this is true for the Conv2D's padding in the upsampling block, but is it also true for the Conv2DTranspose's padding? In a transposed convolution we can't get an output smaller than the input tensor, so why is “same” padding used in the Conv2DTranspose?
Your understanding is correct: a Conv2DTranspose output can't be smaller than its input. As far as the padding parameter goes, this link should help you understand how the output size is a scaled version of the input dimensions.
I was running into the same question too, and from a quick search I was surprised by the lack of a thorough explanation of what this actually does. Here is my best attempt:
Per the TensorFlow docs, “same” results in padding evenly to the left/right or up/down of the input. When padding=“same” and strides=1, the output has the same size as the input.
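To make that concrete, here is a minimal sketch (the shapes are arbitrary, chosen just for illustration) showing that with padding='same' and strides=1, both Conv2D and Conv2DTranspose preserve the spatial size:

import tensorflow as tf

x = tf.random.normal((1, 8, 8, 3))  # batch of one 8x8 feature map with 3 channels
y_conv = tf.keras.layers.Conv2D(4, kernel_size=3, strides=1, padding='same')(x)
y_tconv = tf.keras.layers.Conv2DTranspose(4, kernel_size=3, strides=1, padding='same')(x)
print(y_conv.shape)   # (1, 8, 8, 4) -- spatial size unchanged
print(y_tconv.shape)  # (1, 8, 8, 4) -- spatial size unchanged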
Normally for a transposed convolution, we have output_size = stride * (input_size - 1) + kernel_size - 2 * padding. As a side note, in a transposed convolution the padding term trims space from around the output rather than adding it (note the minus sign). There is some further explanation here.
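As a sanity check on that formula: Keras only accepts 'valid' or 'same' for the padding argument, and 'valid' corresponds to padding = 0 in the formula above (my reading; shapes chosen just for illustration):

import tensorflow as tf

x = tf.random.normal((1, 2, 2, 1))
# padding='valid' means padding = 0, so output_size = stride * (input_size - 1) + kernel_size
y = tf.keras.layers.Conv2DTranspose(filters=1, kernel_size=3, strides=2, padding='valid')(x)
print(y.shape)  # (1, 5, 5, 1), since 2 * (2 - 1) + 3 = 5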
At first this seemed inconsistent with what I observed, because the formula alone doesn't say which padding value “same” corresponds to. So I ran the experiment below. The behavior I saw is that TensorFlow keeps output_size = stride * input_size, which falls out of the formula above if the total padding is chosen as kernel_size - stride.
Experiment:
import tensorflow as tf

t = tf.constant([
    [
        [0, 1],
        [5, 6],
    ]
], dtype=tf.float32)
t = tf.expand_dims(t, axis=-1)  # Now t has shape (1, 2, 2, 1)
# Conv2DTranspose needs a 4D input; with channels_last that is (batch, height, width, channels),
# which is why the input is shaped (1, 2, 2, 1)
conv = tf.keras.layers.Conv2DTranspose(
    filters=1,
    kernel_size=3,
    strides=(2, 2),
    padding='same',
    data_format='channels_last',
    dilation_rate=(1, 1),
    activation=None,
    use_bias=False,
    kernel_initializer='he_normal',
    kernel_regularizer=None,
    bias_regularizer=None,
    activity_regularizer=None,
    kernel_constraint=None,
    bias_constraint=None,
)(t)
print(conv.shape)  # (1, 4, 4, 1)
I see (1, 4, 4, 1) as output. I tried inputs with other spatial shapes, such as (2, 3) and (3, 3), and the behavior output_size = stride * input_size was consistent. I was inspired by ChatGPT o1-preview here, so please do point out any incorrectness (I couldn't find a more detailed explanation of this, though).
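For anyone who wants to reproduce this, here is a small loop (my own sketch, keeping kernel_size=3 from the experiment above) that checks the output_size = stride * input_size rule across a few shapes and strides:

import tensorflow as tf

for h, w in [(2, 2), (2, 3), (3, 3)]:
    for s in [1, 2, 3]:
        x = tf.random.normal((1, h, w, 1))
        y = tf.keras.layers.Conv2DTranspose(1, kernel_size=3, strides=s, padding='same')(x)
        # With padding='same', the output spatial size is always stride * input size
        assert tuple(y.shape) == (1, s * h, s * w, 1), (h, w, s, y.shape)
print("output_size = stride * input_size held for all cases")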
@ai_curious I saw a nice response from you in 2021. As of 2024, the output_padding part has been removed from the TensorFlow API.