Week 1 exercise 2 Maxpool2D padding

Hi there,
In the function “convolutional_model”, I followed the instructions to implement the max pooling layers with “same” padding. However, the model summary looks like this:

Model: “functional_1”

Layer (type)                  Output Shape          Param #
input_3 (InputLayer)          [(None, 64, 64, 3)]   0
conv2d_4 (Conv2D)             (None, 64, 64, 8)     392
re_lu_3 (ReLU)                (None, 64, 64, 8)     0
max_pooling2d_3 (MaxPooling2  (None, 8, 8, 8)       0
conv2d_5 (Conv2D)             (None, 8, 8, 16)      528
re_lu_4 (ReLU)                (None, 8, 8, 16)      0
max_pooling2d_4 (MaxPooling2  (None, 2, 2, 16)      0
flatten_1 (Flatten)           (None, 64)            0
dense_1 (Dense)               (None, 6)             390

Total params: 1,310
Trainable params: 1,310
Non-trainable params: 0

All tests passed!

I’m wondering why n_H and n_W shrank after the MaxPool2D layers with “same” padding?

It’s a good point that this seems a little surprising given the usual meaning of “same” padding for Conv layers. But the point of pooling layers is “downsampling” to reduce the height and width. The documentation for MaxPooling2D (the v1 compat version) does not discuss the meaning of the padding argument, but the MaxPool2D (the official v2 version of the layer) documentation does. Here’s what it says:

  • The resulting output, when using the "valid" padding option, has a spatial shape (number of rows or columns) of: output_shape = math.floor((input_shape - pool_size) / strides) + 1 (when input_shape >= pool_size)

  • The resulting output shape when using the "same" padding option is: output_shape = math.floor((input_shape - 1) / strides) + 1
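To make the two formulas concrete, here is a minimal sketch in plain Python (no TensorFlow needed). The helper name pool_output_size is made up for illustration, and the pool sizes (8, then 4, with stride equal to the pool size) are an assumption inferred from the 64 -> 8 -> 2 shapes in the summary above; they are not stated in this thread.

```python
import math


def pool_output_size(input_size, pool_size, stride, padding):
    """Output size along one spatial dimension, per the two quoted formulas."""
    if padding == "valid":
        if input_size < pool_size:
            raise ValueError("the 'valid' formula assumes input_size >= pool_size")
        return math.floor((input_size - pool_size) / stride) + 1
    if padding == "same":
        return math.floor((input_size - 1) / stride) + 1
    raise ValueError(f"unknown padding: {padding!r}")


# Assumed settings: pool 8 / stride 8, then pool 4 / stride 4.
first = pool_output_size(64, pool_size=8, stride=8, padding="same")
second = pool_output_size(first, pool_size=4, stride=4, padding="same")
print(first, second)  # 8 2

# With stride 1, "same" padding keeps the input size.
print(pool_output_size(64, pool_size=8, stride=1, padding="same"))  # 64
```

Under these assumed settings the numbers match the 8 and 2 shown in the model summary.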

In our case here, the strides are always > 1 (== pool_size), so the dimensions will be reduced. So the bottom line is that “same” padding for pooling layers doesn’t mean the same thing as it does for Conv layers, but you’ll notice that you would get the same dimensions in the case that stride == 1.

Actually I am wrong about that: try adding a non-trivial stride in a Conv2D layer with padding = ‘same’ and you’ll see that the size also gets reduced. So the TF interpretation of ‘same’ padding only gives the same output size if the stride is 1.
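A quick way to see this without building a model: as far as I know, TensorFlow describes the output size for ‘same’ padding as ceil(input_size / stride), regardless of the kernel or pool size. The sketch below is plain Python with a made-up helper name, not TF code, and it also checks that this agrees with the floor formula quoted above.

```python
import math


def same_padding_output_size(input_size, stride):
    # With 'same' padding the output size depends only on the stride,
    # not on the kernel (or pool) size.
    return math.ceil(input_size / stride)


for stride in (1, 2, 4):
    print(stride, same_padding_output_size(64, stride))
# prints:
# 1 64
# 2 32
# 4 16

# The ceil form agrees with floor((input_size - 1) / stride) + 1.
agrees = all(
    same_padding_output_size(n, s) == (n - 1) // s + 1
    for n in range(1, 200)
    for s in range(1, 10)
)
print(agrees)  # True
```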

Thank you. It helps a lot.

Then why use padding at all in the MaxPool2D layer?
I trained the model without padding in both max pooling layers and got better accuracy on both the training and test sets.
I assume TF then pads the input with one extra row and column, but this doesn’t improve the model. That is, using more information from the edges doesn’t improve our model, maybe because the hand signs in the pictures are centered, so there is no valuable information at the edges?
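On how much padding actually gets added, here is a rough sketch of the arithmetic. It uses the total-padding rule from TensorFlow's description of ‘same’ padding (pad just enough for the last window to fit). The helper name is made up, and the pool sizes (8 and 4, with stride equal to the pool size) are again an assumption inferred from the 64 -> 8 -> 2 shapes in the summary, not something stated in this thread.

```python
def same_padding_total(input_size, window, stride):
    """Total rows (or columns) of padding added along one dimension."""
    out = -(-input_size // stride)  # integer ceil(input_size / stride)
    return max((out - 1) * stride + window - input_size, 0)


# Assumed pooling settings for this model.
print(same_padding_total(64, window=8, stride=8))  # 0
print(same_padding_total(8, window=4, stride=4))   # 0

# Hypothetical case where the input is not a multiple of the stride.
print(same_padding_total(65, window=8, stride=8))  # 7
```

If those assumed settings are right, no padding is added in either pooling layer, so ‘same’ and ‘valid’ would give identical outputs here. In that case the accuracy difference between the two runs may come from something else, such as random initialization, rather than from the edges of the images.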