Well, it’s not that it works differently in the max-pooling case; it’s that it works differently whenever stride != 1. “Same” padding computes the amount of padding that gives the same output size only when stride = 1. If the stride is greater than 1, you do not end up with the same output size.
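For concreteness, here’s a minimal sketch (assuming PyTorch and an illustrative 32x32 input with a 3x3 kernel; none of these specifics come from the threads above). The usual “same”-style padding of (kernel - 1) // 2 preserves the spatial size at stride 1 but roughly halves it at stride 2, for both convolution and max pooling:

```python
import torch
import torch.nn as nn

x = torch.randn(1, 3, 32, 32)

# "Same"-style padding for a 3x3 kernel is (3 - 1) // 2 = 1.
# With stride 1 the spatial size is preserved: 32 -> 32.
conv_s1 = nn.Conv2d(3, 3, kernel_size=3, stride=1, padding=1)
print(conv_s1(x).shape)  # torch.Size([1, 3, 32, 32])

# With stride 2 the very same padding roughly halves the size: 32 -> 16.
conv_s2 = nn.Conv2d(3, 3, kernel_size=3, stride=2, padding=1)
print(conv_s2(x).shape)  # torch.Size([1, 3, 16, 16])

# Max pooling behaves the same way with that padding and stride 2.
pool = nn.MaxPool2d(kernel_size=3, stride=2, padding=1)
print(pool(x).shape)     # torch.Size([1, 3, 16, 16])
```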
Here’s a thread which discusses this point in more detail. Note that the same issue also applies to transposed convolutions, as shown in this thread (read from the linked post to the end of the thread). The second thread gives concrete examples of the padding results when stride > 1.
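For the transposed-convolution case, a similar sketch (again assuming PyTorch with illustrative sizes, not taken from the linked threads): with stride 1 a padding of 1 keeps a 3x3 transposed convolution size-preserving, while with stride 2 the output grows to roughly stride times the input, so no fixed padding gives back the original size:

```python
import torch
import torch.nn as nn

x = torch.randn(1, 3, 32, 32)

# Stride 1: padding=1 makes a 3x3 transposed convolution size-preserving.
up_s1 = nn.ConvTranspose2d(3, 3, kernel_size=3, stride=1, padding=1)
print(up_s1(x).shape)  # torch.Size([1, 3, 32, 32])

# Stride 2: the output size is (32 - 1) * 2 - 2 * 1 + (3 - 1) + 1 = 63,
# i.e. roughly stride * input, so the "same" size cannot be recovered
# by padding alone.
up_s2 = nn.ConvTranspose2d(3, 3, kernel_size=3, stride=2, padding=1)
print(up_s2(x).shape)  # torch.Size([1, 3, 63, 63])
```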