Understanding of Conv2D

Atom27 · July 10, 2023, 9:46pm

I don’t understand the first parameter of the Conv2D function.
I found an answer on Stack Overflow (“machine learning - What is the number of filter in CNN?”), but what I still don’t understand is this: if we have 64 filters, why is there only 1 image in the output? Shouldn’t it return 64 images, one for each of the 64 filters? Or is there some logic that picks the 1 image out of these 64 that is most accurate and most useful for the learning process?

paulinpaloalto · July 10, 2023, 10:13pm

It sounds like the problem with the TF courses is that they assume you already understand the definitions of the various kinds of networks and how they work and then they are just showing you how to build those in TF. I would suggest that you might want to take DLS C1, C2 and C4 before you proceed here. In particular DLS C4 explains ConvNets.

Or go find some videos on YouTube that explain how ConvNets work. Here’s a quick sketch of how it works:

Suppose your inputs are RGB images of size 256 x 256 pixels. That means you also have 3 input color channels, so each input is a 3D tensor with shape 256 x 256 x 3. Now suppose you want to apply a Convolutional filter to that image. Let’s suppose we use a 5 x 5 filter with a stride of 1 and “valid” padding, meaning no padding. That means each filter is a 3D tensor of shape 5 x 5 x 3, because the channels of the filter need to match the channels of the input. Then you “step” that 5 x 5 x 3 filter across and down the image and you get an output size that is determined by this formula:

n_{out} = \displaystyle \lfloor \frac {n_{in} + 2p - f}{s} \rfloor + 1

That applies to both h and w, of course, so you get:

n_{out} = \displaystyle \lfloor \frac {256 + 2 * 0 - 5}{1} \rfloor + 1 = 251 + 1 = 252
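If it helps to check the arithmetic, here is a minimal sketch of that formula in plain Python. The function name `conv_output_size` and its argument names are made up for this illustration; they are not part of TF.

```python
def conv_output_size(n_in, f, p=0, s=1):
    """Output height (or width) of a conv layer.

    n_in: input size, f: filter size, p: padding, s: stride.
    Integer floor division implements the floor in the formula.
    """
    return (n_in + 2 * p - f) // s + 1


# The example from this post: 256 x 256 input, 5 x 5 filter,
# "valid" padding (p = 0), stride 1.
size_stride_1 = conv_output_size(256, f=5, p=0, s=1)
print(size_stride_1)  # 252

# Same input and filter with a stride of 2, just to show the floor at work.
size_stride_2 = conv_output_size(256, f=5, p=0, s=2)
print(size_stride_2)  # 126
```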

So the output created by each individual filter will be 252 x 252 x 1. Then if you have 64 separate filters, the final result for each input image will be 252 x 252 x 64. You “stack” the outputs of the individual filters to form the full output. Of course there were a bunch of arbitrary choices we made there: the size of the filters, the stride, the padding and the number of total filters.
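To make the “stacking” concrete, here is a rough NumPy sketch of a naive “valid” convolution. It is only meant to show the shapes, not how TF actually implements it: the function name `conv2d_valid` is hypothetical, the bias term and activation are left out, and the sizes are shrunk (8 x 8 x 3 input, four 3 x 3 x 3 filters) so the loops run quickly.

```python
import numpy as np


def conv2d_valid(image, filters, stride=1):
    """Naive 'valid' convolution (no padding, no bias).

    image:   array of shape (h, w, c)
    filters: array of shape (f, f, c, n_filters)
    returns: array of shape (h_out, w_out, n_filters)
    """
    h, w, c = image.shape
    f, _, c_filter, n_filters = filters.shape
    assert c == c_filter, "filter channels must match input channels"

    h_out = (h - f) // stride + 1
    w_out = (w - f) // stride + 1
    out = np.zeros((h_out, w_out, n_filters))

    for k in range(n_filters):          # one output channel per filter
        for i in range(h_out):
            for j in range(w_out):
                top, left = i * stride, j * stride
                patch = image[top:top + f, left:left + f, :]
                # Each filter collapses an f x f x c patch to ONE number.
                out[i, j, k] = np.sum(patch * filters[..., k])
    return out


rng = np.random.default_rng(0)
image = rng.random((8, 8, 3))                 # made-up "RGB image"
filters = rng.standard_normal((3, 3, 3, 4))   # 4 random 3 x 3 x 3 filters

output = conv2d_valid(image, filters)
print(output.shape)  # (6, 6, 4)
```

So nothing is “picked”: every filter contributes its own 2D slice, and the slices sit side by side along the last axis of one output tensor.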

The other high level point here is that you just initialize all those filters randomly for symmetry breaking and then they learn whatever they need to learn through back propagation. Because they all start out different, they will also (with high probability) learn different things. Of course this is just one “conv” layer. You then need to compose an entire network model, which will probably involve several conv layers with pooling layers and perhaps some fully connected layers at the end, depending on what your goals are.
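To tie this back to the original question, here is a small sketch of the same example as a single Keras layer, assuming TensorFlow 2.x is installed. The first argument to `Conv2D` is the number of filters. The input is just random numbers standing in for one 256 x 256 RGB image.

```python
import numpy as np
import tensorflow as tf

# First argument = number of filters, i.e. the number of output channels.
layer = tf.keras.layers.Conv2D(64, (5, 5), strides=1, padding="valid")

# A batch containing one made-up 256 x 256 RGB image.
images = np.random.rand(1, 256, 256, 3).astype("float32")
outputs = layer(images)  # the layer creates its weights on the first call

output_shape = tuple(outputs.shape)
print(output_shape)  # (batch, height, width, filters)

# get_weights() returns NumPy arrays: the filters and one bias per filter.
kernel, bias = layer.get_weights()
print(kernel.shape)  # (filter height, filter width, input channels, filters)

# The filters are initialized randomly, so they start out different.
filters_differ = not np.allclose(kernel[..., 0], kernel[..., 1])
print(filters_differ)
```

The batch of one image goes in and one tensor comes out, but that tensor has 64 channels, one per filter.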
