Here’s my experiment for forward convolutions. I used the dataset in the U-Net exercise in C4 Week 3:
    # Cell to experiment with padding on Conv2D
    import tensorflow as tf
    from tensorflow.keras.layers import Conv2D

    for strides, padding in [(1, 'valid'), (2, 'valid'),
                             (1, 'same'), (2, 'same'), (3, 'same')]:
        print(f"padding = '{padding}' stride = {strides}")
        playmodel = tf.keras.Sequential()
        playmodel.add(Conv2D(filters=3, kernel_size=3,
                             strides=strides, padding=padding))
        for playimage, _ in processed_image_ds.take(1):
            playimage = tf.expand_dims(playimage, 0)
            print(f"playimage.shape {playimage.shape}")
            playout = playmodel(playimage)
            print(f"playout.shape {playout.shape}")
Running that gives this result:
padding = 'valid' stride = 1
playimage.shape (1, 96, 128, 3)
playout.shape (1, 94, 126, 3)
padding = 'valid' stride = 2
playimage.shape (1, 96, 128, 3)
playout.shape (1, 47, 63, 3)
padding = 'same' stride = 1
playimage.shape (1, 96, 128, 3)
playout.shape (1, 96, 128, 3)
padding = 'same' stride = 2
playimage.shape (1, 96, 128, 3)
playout.shape (1, 48, 64, 3)
padding = 'same' stride = 3
playimage.shape (1, 96, 128, 3)
playout.shape (1, 32, 43, 3)
For forward convolutions, the formula is:

n_{out} = \left\lfloor \dfrac{n_{in} + 2p - f}{s} \right\rfloor + 1
So you can see that in the 'same' padding case they solve for p assuming stride = 1, which gives p = (f - 1)/2 for odd kernel sizes (p = 1 in the examples above), and then use that padding value regardless of what the requested stride actually is. Equivalently, for odd kernel sizes the 'same' output dimension works out to ⌈n_{in}/s⌉, which is how TF documents it. As a result, 'same' padding only gives the same output shape when the actual stride really is 1; for any stride > 1, the output is smaller.
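As a quick sanity check, here's a small pure-Python sketch (no TF needed) that applies the formula with p chosen as in the stride-1 case and reproduces the shapes printed above for the 96 x 128 input:

    import math

    def conv_out(n_in, f, s, padding):
        """Forward conv output length: floor((n_in + 2p - f) / s) + 1,
        with p solved for the stride-1 'same' case (p = (f - 1) // 2)."""
        p = (f - 1) // 2 if padding == 'same' else 0
        return (n_in + 2 * p - f) // s + 1

    # Reproduce the (height, width) pairs from the experiment, f = 3
    for s, pad in [(1, 'valid'), (2, 'valid'), (1, 'same'), (2, 'same'), (3, 'same')]:
        h, w = conv_out(96, 3, s, pad), conv_out(128, 3, s, pad)
        print(f"padding = '{pad}' stride = {s}: ({h}, {w})")

    # For 'same' with an odd kernel this matches the ceil(n_in / s) rule
    assert all(conv_out(n, 3, s, 'same') == math.ceil(n / s)
               for n in (96, 128) for s in (1, 2, 3))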
Stay tuned while I try the analogous thing with transposed convolutions.