# Does the size of the filter have to be the same as the number of channels in the previous image?

In this picture, f2 is 5, but wouldn't it have to be 10 to match the number of channels in a1?

Well, the size of the filter (also known as the kernel) in a convolutional layer does not have to be the same as the number of channels. The filter's spatial size (its height and width) and the number of channels in the previous layer are independent parameters in a convolutional neural network (CNN).

In an earlier video it said that they had to be equal. I might be misunderstanding something though.

I would think that, to be consistent with this diagram showing the 6x6x3 * 3x3x3, since a1 is 37x37x10... then f2 should be 10, since it is the size of the filter?

Well, the "number of channels" in the input and the "number of channels" in the filters (kernels) need to be the same for a standard convolution. This ensures that each filter operates on all channels of the input data, and that's what picture number 2 shows.

However, the "number of filters" can indeed be different. You can use a different number of filters to extract different features from the input. Each filter is responsible for capturing specific patterns or features, and having a variety of filters allows a convolutional layer to learn a diverse set of features. That's why you see different numbers in the first image: the "10" and "20" there are numbers of filters, which then become the channel counts of each layer's output.

For example, in a convolutional layer, you might have:

• Input Image: (Height, Width, Number of Channels) e.g., (64, 64, 3) for an RGB image.
• Convolutional Filters: (Filter Height, Filter Width, Number of Input Channels, Number of Output Channels) e.g., (3, 3, 3, 64).

In this case, you have 64 filters, each with a depth of 3 to match the input's 3 channels. Each filter produces one channel in the output feature map, resulting in an output with 64 channels.
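To make "each filter produces one output channel" concrete, here is a minimal sketch in plain Python. It is not taken from the course material: the function name `conv2d_naive`, the tiny 4x4x3 input, and the two hand-made filters are all illustrative assumptions, and it assumes stride 1 with no padding.

```python
def conv2d_naive(image, filters):
    """Naive stride-1, no-padding convolution on nested lists.

    (As is usual in CNNs, this is technically a cross-correlation.)
    image:   nested list with shape [H][W][C]
    filters: list of filters, each with shape [f][f][C]
    returns: nested list with shape [H-f+1][W-f+1][number of filters]
    """
    height, width, channels = len(image), len(image[0]), len(image[0][0])
    f = len(filters[0])
    # Every filter must be as deep as the input has channels.
    for kernel in filters:
        assert len(kernel[0][0]) == channels, "filter depth must match input channels"
    out_h, out_w = height - f + 1, width - f + 1
    output = [[[0.0] * len(filters) for _ in range(out_w)] for _ in range(out_h)]
    for k, kernel in enumerate(filters):
        for i in range(out_h):
            for j in range(out_w):
                total = 0.0
                # One filter sums over height, width AND all channels,
                # so it yields a single number per position.
                for di in range(f):
                    for dj in range(f):
                        for c in range(channels):
                            total += image[i + di][j + dj][c] * kernel[di][dj][c]
                output[i][j][k] = total
    return output


# Hypothetical toy data: a 4x4 "image" with 3 channels and two 3x3x3 filters.
image = [[[1.0, 2.0, 3.0] for _ in range(4)] for _ in range(4)]
ones = [[[1.0] * 3 for _ in range(3)] for _ in range(3)]
zeros = [[[0.0] * 3 for _ in range(3)] for _ in range(3)]

result = conv2d_naive(image, [ones, zeros])
shape = (len(result), len(result[0]), len(result[0][0]))
print(shape)  # (2, 2, 2): two filters give two output channels
```

The same logic scales to the (64, 64, 3) input with 64 filters: the output would have 64 channels, one per filter, regardless of the 3x3 kernel size.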

I hope it makes sense now. Feel free to ask for more clarification.
Cheers!
Jamal

Yes, that makes perfect sense! So in the screenshot I sent with the original question, shouldn't f2 be equal to 10 instead of 5?

Well, 5 here is the size of the kernel. At f1 it's a 3x3 kernel and at f2 it's a 5x5 kernel. That's why you get the smaller dimension "17x17x20" after applying the formula with stride equal to 2.
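As a quick check of that arithmetic, here is a small sketch of the usual output-size formula, floor((n + 2p - f) / s) + 1. The function name is made up for illustration, and the padding of 0 is an assumption, since the thread only states the stride.

```python
def conv_output_size(n, f, stride=1, padding=0):
    """Spatial output size for input size n and kernel size f.

    Uses floor((n + 2*padding - f) / stride) + 1.
    Note that the number of channels does not appear here at all.
    """
    return (n + 2 * padding - f) // stride + 1


# a1 is 37 wide, f2 = 5, stride 2, padding assumed to be 0.
size = conv_output_size(37, 5, stride=2)
print(size)  # 17
```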

I will try to make it clear for you.

1. Kernel Size (Filter Size):

• The "kernel size" or "filter size" refers to the dimensions of the convolutional filter (kernel) used in a convolutional layer.
• It determines how many pixels the filter considers at a time when sliding over the input.
• Common kernel sizes are 3x3, 5x5, or 7x7, and they are specified as (height, width).
2. Number of Channels:

• The "number of channels" represents the depth or the number of feature maps in the input data.
• In the context of an RGB image, there are typically three color channels: Red, Green, and Blue (RGB).
• For grayscale images, there is only one channel.
• In the input tensor, the number of channels is usually denoted as the last dimension (e.g., (Height, Width, Number of Channels)).
3. Number of Filters:

• The "number of filters" (also known as "number of output channels") refers to how many individual convolutional filters are applied to the input.
• Each filter is responsible for learning a set of spatial patterns or features from the input.
• The number of filters determines the depth or the number of channels in the output feature map.
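One way to see how the three quantities above fit together is to count the learnable weights in a layer. This is only a sketch: the function name is hypothetical, and it assumes square kernels and one bias per filter, which is common but not stated in the thread.

```python
def conv_layer_params(kernel_size, in_channels, num_filters, use_bias=True):
    """Count learnable parameters in a convolutional layer.

    Each filter holds kernel_size * kernel_size * in_channels weights,
    and there are num_filters such filters.
    Assumes square kernels and (optionally) one bias per filter.
    """
    weights = kernel_size * kernel_size * in_channels * num_filters
    biases = num_filters if use_bias else 0
    return weights + biases


# Layer 2 from the question: 5x5 kernel, 10 input channels, 20 filters.
params = conv_layer_params(kernel_size=5, in_channels=10, num_filters=20)
print(params)  # 5020 = 5*5*10*20 weights + 20 biases
```

Kernel size, input channels, and number of filters each enter as a separate factor, which is why changing one does not force a change in the others.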

Ohhhh... So f2 means 5x5, the 10 (for the number of channels) is just filled in automatically to give 5x5x10 as the filter size, and then 20 is the total number of filters?

Yes, f2 means 5x5. You get 10 channels because the previous layer used 10 filters, and the 20 at the third layer is because layer 2 used 20 filters, and so on.
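Here is a rough sketch of that chaining, where you only choose the kernel size and the number of filters and the filter depth is taken from the incoming activation. The function name and the zero padding are assumptions for illustration; the shapes come from the numbers discussed above.

```python
def apply_conv_layer(input_shape, kernel_size, num_filters, stride=1, padding=0):
    """Return (full filter shape, output shape) for one conv layer.

    input_shape is (height, width, channels). The filter depth is not a
    free choice: it is copied from the input's channel count.
    """
    height, width, in_channels = input_shape
    filter_shape = (kernel_size, kernel_size, in_channels, num_filters)
    out_h = (height + 2 * padding - kernel_size) // stride + 1
    out_w = (width + 2 * padding - kernel_size) // stride + 1
    # The output has one channel per filter.
    return filter_shape, (out_h, out_w, num_filters)


a1 = (37, 37, 10)  # produced by a previous layer with 10 filters
filter_shape, a2 = apply_conv_layer(a1, kernel_size=5, num_filters=20, stride=2)
print(filter_shape)  # (5, 5, 10, 20)
print(a2)            # (17, 17, 20)
```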

Thank you @Jamal022

You're welcome!

Happy Learning!!!