C4-CNN-Week3 quiz, question 9: transpose convolution output “shape”


Hello teaching team, I know the “transpose convolution” is like projecting a scaled version of the kernel onto an output grid at every input location.

According to the GeeksforGeeks article “What is Transposed Convolutional Layer?”, the output size of a transpose convolution is:

output size = (input size - 1) x stride + kernel size - 2 x padding

In the case of question 9, the output shape = (2 - 1) x 2 + 3 - 2 x 1 = 3.

However, the provided answer is a grid of size 4 x 4.
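As a quick check of the arithmetic above, here is a minimal Python sketch of the quoted formula. The function name `transpose_conv_output_size` is made up for illustration, and the formula is the one from the article, not necessarily the one the quiz uses.

```python
def transpose_conv_output_size(input_size, kernel_size, stride, padding):
    """Output size per the formula quoted in this post (no output-padding term)."""
    return (input_size - 1) * stride + kernel_size - 2 * padding


# Values for question 9 as given in this post:
# input size 2, kernel size 3, stride 2, padding 1.
print(transpose_conv_output_size(input_size=2, kernel_size=3, stride=2, padding=1))  # 3
```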

Thanks

The output size of a transpose convolution is ambiguous: there is no universally agreed-upon formula for it. The problem is that in a forward convolution with a stride greater than 1, multiple input sizes can give you the same output size, so going in the reverse direction does not have a unique answer. Here’s a thread from mentor Raymond that explains this in detail.
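To make the ambiguity concrete, here is a small Python sketch. It assumes the usual floor-based forward convolution formula, floor((n + 2p - f) / s) + 1, and the helper names `conv_output_size` and `candidate_input_sizes` are hypothetical, chosen only for this illustration.

```python
def conv_output_size(input_size, kernel_size, stride, padding):
    """Forward convolution output size, assuming the floor-based formula."""
    return (input_size + 2 * padding - kernel_size) // stride + 1


def candidate_input_sizes(output_size, kernel_size, stride, padding, max_size=32):
    """All input sizes up to max_size whose forward convolution gives output_size."""
    return [
        n
        for n in range(1, max_size + 1)
        if n + 2 * padding >= kernel_size  # the kernel must fit in the padded input
        and conv_output_size(n, kernel_size, stride, padding) == output_size
    ]


# Kernel size 3, stride 2, padding 1, forward output size 2 (the values in this thread).
print(candidate_input_sizes(output_size=2, kernel_size=3, stride=2, padding=1))  # [3, 4]
```

Under these assumptions both a 3 x 3 and a 4 x 4 grid map to a 2 x 2 grid in the forward direction, so either size is a defensible output for the transpose convolution. The formula in the question picks the smaller one, while a 4 x 4 answer corresponds to the larger one. Some frameworks let you choose explicitly; PyTorch's `ConvTranspose2d`, for example, has an `output_padding` argument for this purpose.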

Note that we need to treat quiz questions and answers the same way we do solution code, so I will “unlist” this thread. That means only the original author and the mentors can see it, not everyone on the forum.