I have the same question that Iratxe Moya posted in the Coursera discussion forum.
I am pasting it here because it was left unanswered there.
Iratxe Moya
16 days ago
Hello, I have a question about the output shape of the convolutions. I understand that if I feed N images of 28x28 pixels each into the model, then after the convolution I get 64 sub-images (feature maps) for each of the N images, one per filter of the convolution. This makes sense when I apply only one convolutional layer, but it is not clear to me what happens when I apply a second convolutional layer after the first one (and after its pooling layer, of course).
Let’s imagine I have the following model:
import tensorflow as tf

model = tf.keras.Sequential([
    # 64 filters of size 3x3; the filter weights are learned during training
    tf.keras.layers.Conv2D(64, (3, 3), activation='relu', input_shape=(28, 28, 1)),
    # Keep the maximum value of each 2x2 block of pixels
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Conv2D(42, (3, 3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(1024, activation=tf.nn.relu),
    # 10 is the number of different image classes we have
    tf.keras.layers.Dense(10, activation=tf.nn.softmax)
])
If I execute model.summary(), I get the following:
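For reference, the shapes and parameter counts that such a summary reports can be worked out by hand. The following is only a minimal sketch in plain Python, assuming the Keras defaults used by the model above ('valid' padding and stride 1 for the convolutions, pooling stride equal to the pool size). The helper names `conv2d_shape` and `maxpool_shape` are hypothetical and are not part of Keras.

```python
def conv2d_shape(h, w, c_in, filters, k=3):
    """Output shape and parameter count of a 'valid', stride-1 convolution (assumed defaults)."""
    # Each filter spans all c_in input channels: k*k*c_in weights plus one bias.
    params = (k * k * c_in + 1) * filters
    return (h - k + 1, w - k + 1, filters), params


def maxpool_shape(h, w, c, pool=2):
    """Output shape of non-overlapping max pooling; the channel count is unchanged."""
    return (h // pool, w // pool, c)


# Walk the input shape (height, width, channels) through the layers of the model.
conv1_shape, conv1_params = conv2d_shape(28, 28, 1, filters=64)
pool1_shape = maxpool_shape(*conv1_shape)
conv2_shape, conv2_params = conv2d_shape(*pool1_shape, filters=42)
pool2_shape = maxpool_shape(*conv2_shape)
flat_size = pool2_shape[0] * pool2_shape[1] * pool2_shape[2]

print(conv1_shape, conv1_params)  # (26, 26, 64) 640
print(pool1_shape)                # (13, 13, 64)
print(conv2_shape, conv2_params)  # (11, 11, 42) 24234
print(pool2_shape)                # (5, 5, 42)
print(flat_size)                  # 1050
```

The parameter count of the second convolution, (3 x 3 x 64 + 1) x 42 = 24234, already hints that each of its 42 filters covers all 64 input channels at once.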
Here I can see that after the second convolution, the number of output channels is not 2688 (64 x 42), as I first expected, but only 42. So how are the filters of the second convolution applied? I thought that maybe each filter of the second convolution is applied to each of the previously convolved sub-images and the results are then averaged, or something similar, but it is not clear to me. Any help?
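One way to check the "applied to every sub-image and then combined" idea is to write the operation out by hand. The sketch below is an illustration in plain Python, not Keras's actual implementation, and the function name `conv2d_multi_channel` is hypothetical. It assumes 'valid' padding and stride 1, and it leaves out the bias and the ReLU activation. Under those assumptions, each filter has one 3x3 slice per input channel, and the per-channel products are summed (not averaged) into a single output value, so the number of output channels equals the number of filters.

```python
def conv2d_multi_channel(image, filters):
    """Multi-channel 'valid' convolution with stride 1, no bias, no activation.

    image is indexed as image[y][x][c]; filters as filters[f][dy][dx][c].
    Returns out[y][x][f]: one output channel per filter.
    """
    h, w, c_in = len(image), len(image[0]), len(image[0][0])
    k = len(filters[0])
    out = []
    for y in range(h - k + 1):
        row = []
        for x in range(w - k + 1):
            pixel = []
            for f in filters:
                total = 0.0
                # Sum over the k x k window AND over every input channel.
                for dy in range(k):
                    for dx in range(k):
                        for c in range(c_in):
                            total += image[y + dy][x + dx][c] * f[dy][dx][c]
                pixel.append(total)
            row.append(pixel)
        out.append(row)
    return out


# Toy sizes instead of 13x13x64 and 42 filters, so the result is easy to verify.
c_in, n_filters = 3, 2
image = [[[1.0] * c_in for _ in range(4)] for _ in range(4)]
filters = [[[[1.0] * c_in for _ in range(3)] for _ in range(3)]
           for _ in range(n_filters)]

out = conv2d_multi_channel(image, filters)
print(len(out), len(out[0]), len(out[0][0]))  # 2 2 2
print(out[0][0])                              # [27.0, 27.0]
```

With an all-ones input and all-ones filters, every output value is 3 x 3 x 3 = 27, which shows the sum running across all input channels. Scaled up to the model above, each of the 42 filters would have shape 3x3x64 and produce exactly one 11x11 feature map.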