How does padding work in transpose convolution?

ZHONG_Yiyuan · January 9, 2023, 5:46pm

In the video, only the first row and the first column are removed. Why are the last row and last column kept?

In the example in this link, the first and last rows and columns are removed.

I am really confused about it

paulinpaloalto · January 10, 2023, 1:16am

I think you’re just misinterpreting the slides. All the grey pixels around the edges are ignored. It looks a bit asymmetric only because stride = 2 means you only get two horizontal operations: the third one would take you off the end.

ZHONG_Yiyuan · January 10, 2023, 1:24am

Then how do you determine the size of the output before the operation? In other words, why is the last grey column the sixth column rather than the fifth?

paulinpaloalto · January 10, 2023, 4:48am

That is the definition of how padding works. The formula for the dimensions of the output is given on this thread.

ZHONG_Yiyuan · January 10, 2023, 6:11am

Using the formula n_{out}=(n_{in}-1)\times s+f-2p with n_{in}=2, s=2, p=1, f=3, we get n_{out}=(2-1)\times 2+3-2\times 1=3. So is the output in this slide wrong?
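For readers who want to check the arithmetic, here is a minimal sketch of the formula quoted above. The function name `transposed_conv_output_size` is made up for illustration and is not from any library.

```python
def transposed_conv_output_size(n_in: int, f: int, s: int, p: int) -> int:
    """Output size per the formula n_out = (n_in - 1) * s + f - 2 * p."""
    return (n_in - 1) * s + f - 2 * p


# Values from the post: 2x2 input, 3x3 filter, stride 2, padding 1.
print(transposed_conv_output_size(n_in=2, f=3, s=2, p=1))  # 3
```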

paulinpaloalto · January 10, 2023, 4:08pm

You may be right. More research needed …

I have not really looked carefully at the meaning of padding in transposed convolutions. I will try to find more information and let you know if I can find anything relevant.

rmwkwok · January 15, 2023, 7:53am

Hello @ZHONG_Yiyuan, and @paulinpaloalto,

I think both Andrew’s result and Paul’s formula are NOT wrong. There is some ambiguity built into this. Consider the formula for the output size of a normal convolution:

n_{out}=\lfloor (n_{in}+2p-f)/s \rfloor+1

The floor operation means that different input sizes can give the same output size. Now, if we are to design the transposed convolution operation, and we are given an input image of size (2 x 2), should the operation return a (3 x 3) or a (4 x 4) matrix?

That is the ambiguity. There are two possibilities in my above example.

In short, the n_{out} = 3 that you calculated using Paul’s formula and the n_{out} = 4 that you see in Andrew’s video are two of the possibilities. This makes more sense if you do a normal convolution with a (3 x 3) input and another with a (4 x 4) input, both using a (3 x 3) kernel, stride = 2, padding = 1: both result in a (2 x 2) output. Again, that is the ambiguity.
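The claim that both input sizes collapse to the same output can be checked with a few lines. This is only a sketch of the standard floor-based size formula for a normal convolution; `conv_output_size` is a hypothetical helper name.

```python
def conv_output_size(n_in: int, f: int, s: int, p: int) -> int:
    """Normal convolution: floor((n_in + 2p - f) / s) + 1."""
    return (n_in + 2 * p - f) // s + 1


# A 3x3 and a 4x4 input, both with a 3x3 kernel, stride 2, padding 1.
for n_in in (3, 4):
    print(n_in, "->", conv_output_size(n_in, f=3, s=2, p=1))
# 3 -> 2
# 4 -> 2
```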

Now, the question becomes: how do we address this in an implementation of the transposed convolution? PyTorch uses the same equation as Paul’s as far as only stride, padding, kernel size, and image size are concerned. TensorFlow has different equations depending on how you parameterize it.

Therefore, Andrew’s implementation results in what you see in the video, and you can implement your own that behaves the way you described in your first post. What does not change is that we use the steps described by Andrew to do all the element-wise multiplications and then place the results in the right positions of an output matrix whose shape is pre-computed (by a formula like Paul’s, PyTorch’s, TensorFlow’s, Andrew’s, or yours).
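A minimal pure-Python sketch of those steps, under some assumptions: square input and kernel, and an `output_padding` argument (a name borrowed from PyTorch and Keras, assumed here to lie in 0..stride-1) that picks which of the possible output sizes to pre-compute. The function name and the numeric values are illustrative, not anyone's reference implementation.

```python
def transposed_conv2d(x, kernel, stride, padding, output_padding=0):
    """Transposed convolution of a square input with a square kernel."""
    n_in, f = len(x), len(kernel)
    # Pre-compute the output size, then allocate a canvas that also holds
    # the `padding`-wide border that is cropped away at the end.
    n_out = (n_in - 1) * stride + f - 2 * padding + output_padding
    size = n_out + 2 * padding
    canvas = [[0] * size for _ in range(size)]

    # Scale the kernel by each input entry and add it at the strided position.
    for i in range(n_in):
        for j in range(n_in):
            for a in range(f):
                for b in range(f):
                    canvas[i * stride + a][j * stride + b] += x[i][j] * kernel[a][b]

    # Crop the border so only the n_out x n_out interior remains.
    return [row[padding:padding + n_out] for row in canvas[padding:padding + n_out]]


# Illustrative values only.
x = [[2, 1], [3, 2]]
kernel = [[1, 2, 1], [2, 0, 1], [0, 2, 1]]

small = transposed_conv2d(x, kernel, stride=2, padding=1)                    # 3 x 3
large = transposed_conv2d(x, kernel, stride=2, padding=1, output_padding=1)  # 4 x 4
print(len(small), len(large))  # 3 4
```

In this sketch the multiply-and-place loop is identical for both calls; only the pre-computed canvas size differs, and the 3 x 3 result is the top-left corner of the 4 x 4 one.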

Lastly, for some reason I have spent some time on Transposed Convolution and thus written an article about it. If you are interested in understanding it from another angle, please feel free to read it.

Raymond

ZHONG_Yiyuan · January 15, 2023, 9:24am

Thanks for clarification. Your article is very helpful and clear.

ZHONG_Yiyuan · January 15, 2023, 9:32am

I find the way that TensorFlow infers the output size when output_padding=None difficult to understand.

rmwkwok · January 15, 2023, 9:38am

You are welcome @ZHONG_Yiyuan!

I am taking a walk now thinking what I should add to the article before I forget about it. Haha, the article helps me remember things.

As for the output padding, I didn’t look into that either. Although I am going to update that part of my article, I am not going to get into the details, unless perhaps I see some documentation about it from TensorFlow.

I am not very interested in it because, without more context, the output_padding parameter does nothing more than adjust the output shape. It doesn’t change the arithmetic, which is the core. The parameter has an impact, but it doesn’t look extremely important to me at this stage.

Perhaps we need to ask ourselves, who will use that parameter and for what purpose. I don’t have an answer to that, but when you have it, please share with me.

Cheers,
Raymond

rmwkwok · January 15, 2023, 2:50pm

Hello @ZHONG_Yiyuan, I have updated my article to explain that the purpose of output_padding is to address that ambiguity. However, it does not cover output_padding=None.

Raymond

tarunsaxena1000 · June 14, 2024, 9:21pm

The way I see it, the output size of a normal convolution is given by
[(n+2p-f)/s +1], where [ ] is the greatest integer (floor) function.

Hence, putting the values of the transposed convolution into the above formula, we have

[(n+2*1-3)/2 +1] = 2

where p=1, s=2, f=3, and we treat the 2x2 input of the transposed convolution as the output of the normal convolution.

Hence
[(n-1)/2] = 1

Therefore
1<= (n-1)/2 <2
2<= n-1 <4
3<= n <5

hence
n = 3 or 4
@rmwkwok @paulinpaloalto @ZHONG_Yiyuan
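The step from [(n-1)/2] = 1 to 1 <= (n-1)/2 < 2 is just the definition of the floor: [x] = 1 exactly when 1 <= x < 2. A small sketch of the same inversion for general values follows; `candidate_input_sizes` is a hypothetical helper name, and it assumes the floor-based size formula used in the derivation above.

```python
def candidate_input_sizes(n_out: int, f: int, s: int, p: int) -> list[int]:
    """All n with floor((n + 2p - f) / s) + 1 == n_out.

    The floor condition is (n_out - 1) * s <= n + 2p - f < n_out * s,
    so exactly s consecutive sizes qualify.
    """
    smallest = (n_out - 1) * s + f - 2 * p
    return list(range(smallest, smallest + s))


print(candidate_input_sizes(n_out=2, f=3, s=2, p=1))  # [3, 4]
```

With stride 1 the list has a single entry, so the ambiguity only appears for strides greater than 1.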

rmwkwok · June 15, 2024, 2:42am

Hello, @tarunsaxena1000,

Yes, either an input size of 3 or 4, as you calculated, may produce 2 as the output size!

Raymond

khteh · September 17, 2025, 2:58am

1<= (n-1)/2 <2 Why 1<= when it could be any number, -inf for instance?

khteh · September 17, 2025, 3:22am

Sorry, which article is being referred to here?

rmwkwok · September 17, 2025, 10:07am

Hello, @khteh, this one (this is a friend-link so it should require no login or whatever, to my understanding). It’s written by me 2.5 years ago. Let me know if you have any feedback, and I hope it’s not too poorly written…

I think I should add the link back to my post above.

Thanks for asking!

Raymond
