Dimension of output layer of max pooling in Inception Network

Abidhasan · August 23, 2021, 9:38pm

In the Week 2 Inception Network lecture, the output of the max pooling operation on a 28*28*192 volume is shown to be 28*28*32. But a pooling operation leaves the depth unchanged, so the input and the output should have the same number of channels.

Then how is it possible for the pooling operation to take an input of shape 28*28*192 and produce an output of shape 28*28*32?

neurogeek · August 24, 2021, 1:06pm

Hi @Abidhasan,

So, Prof. Ng actually addresses this in the lecture when talking about max pooling (also, these are just examples to explain the motivation for the Inception network, so they might look a bit strange). The quote is as follows:

Now in order to make all the dimensions match, you actually need to use padding for max pooling. So this is an unusual form of pooling, because if you want the input to have a height and width of 28 by 28 and have the output match the dimensions of everything else, also 28 by 28, then you need to use same padding as well as a stride of one for pooling.

So, with same padding and stride of 1, the output shape after applying the pooling[0] would be:

output_shape = math.floor((input_shape - 1) / strides) + 1

or:

output_shape = math.floor((28 - 1) / 1) + 1 = 28

So you’ll end up with a 28x28x(number of filters, which in this case is 32) which would be 28x28x32.
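The spatial-size arithmetic above can be checked with a small sketch. It uses the "same" and "valid" formulas from the Keras pooling documentation linked below; the helper name `pooled_size` and the 3x3 pool size are assumptions for illustration, not something stated in the lecture quote. Note that it only covers height and width and says nothing about the number of channels.

```python
import math

def pooled_size(input_size, pool_size, stride, padding):
    """Output size of a pooling layer along one spatial axis (Keras convention)."""
    if padding == "same":
        # With "same" padding the pool size does not affect the output size.
        return math.floor((input_size - 1) / stride) + 1
    if padding == "valid":
        return math.floor((input_size - pool_size) / stride) + 1
    raise ValueError("padding must be 'same' or 'valid'")

# Assumed 3x3 pool with stride 1 on a 28-wide input.
same_size = pooled_size(28, pool_size=3, stride=1, padding="same")
valid_size = pooled_size(28, pool_size=3, stride=1, padding="valid")
print(same_size, valid_size)  # 28 26
```

With "valid" padding the same pool would shrink the map to 26x26, which is why same padding is needed for the branches to line up.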

Hope that helps!

[0] https://keras.io/api/layers/pooling_layers/max_pooling2d/

BrutalCaeser · September 24, 2021, 2:29pm

Hi, in the example we have an input layer of 28X28X192. It is true that we preserve the height and width by applying zero padding with stride=1, but how does this affect the number of channels? They should remain the same, since max pooling is applied to each of the 192 input channels, so we get a 28X28X192 max pooled layer and not a 28X28X32 layer.

Ashish_Siwach · May 2, 2022, 2:25pm

The dimensions of the max pool output are 28X28X32 because we are using 32 filters, and the number of channels in the output is equal to the number of filters/kernels we use.

Csaba_Aszalos · July 4, 2022, 9:11am

There is a Clarifications reading section before the video. This clarification should be added there, noting that later in the video(s) it is explained that a 1x1 CONV is used to reduce the number of channels to 32 after applying MAX-POOL.

Dotan_K · November 15, 2022, 8:56pm

Hi,
I have the same issue as @Abidhasan, so I’ll try to clarify: the problem is not with the width or height. The problem is with the depth!

As far as I have learned, max pooling doesn’t change the depth.
It doesn’t work ‘volumetrically’ like convolution filters. It works on each depth channel separately. Therefore, if the input has depth=192, the output has depth=192.
Therefore, the output depth can’t be the number of filters, as it is for convolution layers.
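To make the per-channel behaviour concrete, here is a minimal pure-Python sketch of stride-1 max pooling with same padding. The function name `max_pool_same`, the 3x3 window, and the toy 4 x 4 x 6 input (standing in for 28 x 28 x 192) are all assumptions for illustration. Out-of-bounds positions are simply skipped, which for a max is equivalent to padding with negative infinity.

```python
def max_pool_same(x, pool_size=3):
    """Stride-1 max pooling with 'same' padding on an H x W x C nested list."""
    height, width, channels = len(x), len(x[0]), len(x[0][0])
    before = (pool_size - 1) // 2      # window extent above / to the left
    after = pool_size - 1 - before     # window extent below / to the right
    out = []
    for i in range(height):
        row = []
        for j in range(width):
            rows = range(max(0, i - before), min(height, i + after + 1))
            cols = range(max(0, j - before), min(width, j + after + 1))
            # Each channel k is pooled on its own; values never mix across channels.
            row.append([max(x[r][c][k] for r in rows for c in cols)
                        for k in range(channels)])
        out.append(row)
    return out

# Toy input with arbitrary deterministic values.
x = [[[(i * 7 + j * 3 + k * 5) % 11 for k in range(6)]
      for j in range(4)] for i in range(4)]
y = max_pool_same(x)

in_shape = (len(x), len(x[0]), len(x[0][0]))
out_shape = (len(y), len(y[0]), len(y[0][0]))
print(in_shape, out_shape)  # (4, 4, 6) (4, 4, 6)
```

The output has exactly as many channels as the input, so pooling alone cannot turn 192 channels into 32.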

paulinpaloalto · November 16, 2022, 12:15am

The answer is in that reply from @Csaba_Aszalos that you quoted:

In other words, you’re exactly right about how max pooling layers work, but it’s not a pure max pooling layer there: it’s a max pooling layer followed by a 1 x 1 convolution to reduce the number of channels.

This is explained by Prof Ng in the second video on Inception Networks starting at about 2:00 into the lecture.
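As a rough illustration of why the 1 x 1 convolution is the step that changes the depth, here is a pure-Python sketch. The helper `conv1x1`, the random weights, and the toy sizes (6 input channels reduced to 2, standing in for 192 reduced to 32) are assumptions for illustration; bias and activation are left out.

```python
import random

def conv1x1(x, weights):
    """1 x 1 convolution: at every pixel, map C_in channels to C_out channels."""
    c_in, c_out = len(weights), len(weights[0])
    return [[[sum(pixel[k] * weights[k][f] for k in range(c_in))
              for f in range(c_out)]
             for pixel in row]
            for row in x]

random.seed(0)
H, W, C_IN, C_OUT = 4, 4, 6, 2  # toy sizes, not the lecture's 28, 28, 192, 32

# Stand-in for the max pooling output, which still has C_IN channels.
pooled = [[[random.random() for _ in range(C_IN)]
           for _ in range(W)] for _ in range(H)]
# One weight column per output filter.
weights = [[random.random() for _ in range(C_OUT)] for _ in range(C_IN)]

reduced = conv1x1(pooled, weights)
pooled_shape = (len(pooled), len(pooled[0]), len(pooled[0][0]))
reduced_shape = (len(reduced), len(reduced[0]), len(reduced[0][0]))
print(pooled_shape, reduced_shape)  # (4, 4, 6) (4, 4, 2)
```

Height and width are untouched and only the channel count changes, which matches the 28 x 28 x 192 to 28 x 28 x 32 step in the lecture when 32 filters are used.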
