Question on Convolution Kernel Sizes

In assignment one (Week 2) you write:

  • Zero-padding pads the input with a pad of (3,3)
  • Stage 1:
    • The 2D Convolution has 64 filters of shape (7,7) and uses a stride of (2,2).
    • BatchNorm is applied to the ‘channels’ axis of the input.
    • ReLU activation is applied.
    • MaxPooling uses a (3,3) window and a (2,2) stride.
  • Stage 2:
    • The convolutional block uses three sets of filters of size [64,64,256], “f” is 3, and “s” is 1.
    • The 2 identity blocks use three sets of filters of size [64,64,256], and “f” is 3.
  • Stage 3:
    • The convolutional block uses three sets of filters of size [128,128,512], “f” is 3 and “s” is 2.
    • The 3 identity blocks use three sets of filters of size [128,128,512] and “f” is 3.
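For reference, my understanding of Stage 1 in code is roughly the following sketch (I'm assuming the standard Keras layers and the (64, 64, 3) input the assignment uses, so layer names and initializers are omitted):

```python
from tensorflow.keras.layers import (Input, ZeroPadding2D, Conv2D,
                                     BatchNormalization, Activation, MaxPooling2D)

X_input = Input(shape=(64, 64, 3))           # input shape from the assignment
X = ZeroPadding2D((3, 3))(X_input)           # zero-padding with a pad of (3,3)
X = Conv2D(64, (7, 7), strides=(2, 2))(X)    # Stage 1: 64 filters of shape (7,7), stride (2,2)
X = BatchNormalization(axis=3)(X)            # BatchNorm on the channels axis
X = Activation('relu')(X)                    # ReLU activation
X = MaxPooling2D((3, 3), strides=(2, 2))(X)  # (3,3) window, (2,2) stride
# X now has shape (None, 15, 15, 64), the first shape in my list below
```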

I passed the coding fine, and I understand the network architecture. What I don’t understand is why in Stage 2 and Stage 3 the filters are [64, 64, 256] and then double. Isn’t the size (64, 64) too big for the actual size of the network at that point?

The actual layer shapes when we look at the model are:

(None, 15, 15, 64)
(None, 15, 15, 64)
(None, 15, 15, 64)
(None, 15, 15, 256)
(None, 15, 15, 256)
(None, 15, 15, 256)
(None, 8, 8, 128)
(None, 8, 8, 128)
(None, 8, 8, 512)
(None, 8, 8, 512)
(None, 8, 8, 512)
(None, 4, 4, 256)
(None, 4, 4, 256)
(None, 4, 4, 1024)
(None, 4, 4, 1024)
(None, 4, 4, 1024)

So the last number follows the number of channels that we set up, but the spatial size is decreasing. Yet we are using 64, then 128, then 256.

Why do we specify the convolutional layers to have larger and larger sizes (64, 64), then (128, 128)?

Thanks in advance,

I think the issue is just that you are misinterpreting the meaning of the parameter that is specified as [64, 64, 256]. Those are not the sizes of the filters in the sense of f: they are the numbers of output channels for three different layers. So in other words, in Stage 2 you invoke the convolutional_block function with [64, 64, 256] and f = 3, and you end up creating three separate convolutional layers:

64 filters of shape (1, 1)
64 filters of shape (3, 3)
256 filters of shape (1, 1)

Note that only the middle layer uses the (f, f) window; the first and third are (1, 1) “bottleneck” convolutions.

The actual output size after each of those layers is something you have to compute by taking the stride, the padding, and the input size into account. That’s what you see in the actual layer shapes that you show later.
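As a reminder, the spatial output size of a conv layer is floor((n + 2p − f) / s) + 1, where n is the input size, f the filter size, p the padding, and s the stride. Here is a quick check of the Stage 2 → Stage 3 transition in your list, assuming (as in the notebook) that the convolutional block applies its stride s in the first (1, 1) layer; conv_output_size is just my name for the helper:

```python
from math import floor

def conv_output_size(n, f, p, s):
    """Spatial output size of a conv layer: floor((n + 2p - f) / s) + 1."""
    return floor((n + 2 * p - f) / s) + 1

# Stage 2 -> Stage 3: a 15 x 15 input through a (1,1) conv, no padding, stride 2
print(conv_output_size(n=15, f=1, p=0, s=2))  # -> 8, matching the (None, 8, 8, ...) shapes
```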

Take a look at what the logic in convolutional_block actually does with that input.
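If it helps, here is a minimal sketch of that logic, assuming the bottleneck structure the notebook uses (convolutional_block_sketch is my own name, and I’ve left out the layer names, initializers, and the training flag, so don’t read it as the graded implementation):

```python
from tensorflow.keras.layers import Conv2D, BatchNormalization, Activation, Add

def convolutional_block_sketch(X, f, filters, s=2):
    """Main path plus shortcut, stripped down to the layer structure."""
    F1, F2, F3 = filters        # e.g. [64, 64, 256] in Stage 2
    X_shortcut = X

    # First component: F1 filters of shape (1,1), stride (s,s)
    X = Conv2D(F1, (1, 1), strides=(s, s))(X)
    X = BatchNormalization(axis=3)(X)
    X = Activation('relu')(X)

    # Second component: F2 filters of shape (f,f) -- the only place f is used
    X = Conv2D(F2, (f, f), strides=(1, 1), padding='same')(X)
    X = BatchNormalization(axis=3)(X)
    X = Activation('relu')(X)

    # Third component: F3 filters of shape (1,1)
    X = Conv2D(F3, (1, 1), strides=(1, 1))(X)
    X = BatchNormalization(axis=3)(X)

    # Shortcut path: a (1,1) conv so the channel count matches F3 for the Add
    X_shortcut = Conv2D(F3, (1, 1), strides=(s, s))(X_shortcut)
    X_shortcut = BatchNormalization(axis=3)(X_shortcut)

    X = Add()([X, X_shortcut])
    return Activation('relu')(X)
```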

As to the question of why the number of channels goes up as you proceed through the network, that is the general way that ConvNets work. Prof Ng discusses this at a number of points in the lectures, but you can think of it as the spatial area being reduced and “distilled” down into the detection of more and more features as you proceed through the network.

There is a really interesting lecture in Week 4 titled “What are Deep ConvNets Learning” in which Prof Ng shows and explains some really cool work that gives us a way to visualize what the inner layers of the network are actually detecting. Even if you haven’t gotten to Week 4 yet, I think that lecture would still make sense and is definitely worth a look either now or when you get there or both. :nerd_face:

Thank you, this was really helpful. I understand it now. I sometimes find a gap between the theory and what I understand of the code. I appreciate it!