Hello,
I have a few questions regarding pool size, kernel size, and strides. Suppose we have an image of size 64 x 64: what should the kernel size, pool size, and strides be, approximately?
Also, how is the output shape defined in the following image? In the format [(None, 64, 84, 1)], is 64, 84, 1 the image size, the filter size, or the pool size?
Hey @sandeep_goshika,
I'll try to break your questions into points, and I hope it helps.
First, let's start with kernel size: the kernel size, also known as the filter size, is typically a small square matrix that slides over the input image during the convolution operation. Common kernel sizes are 3x3, 5x5, and 7x7. The choice of kernel size depends on the complexity of the features you're trying to capture in the data.
So for your 64 x 64 example, common kernel sizes would be 3x3 or 5x5.
Always remember that hyperparameters require experimentation until you find the combination that best fits your case.
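Here is a minimal sketch, assuming TensorFlow/Keras (which matches the [(None, 64, 84, 1)] shape format in your question), of what setting the kernel size looks like in code; the filter count and activation are just illustrative:

```python
import tensorflow as tf

# 64x64 grayscale input, as in your example
inputs = tf.keras.Input(shape=(64, 64, 1))

x = tf.keras.layers.Conv2D(
    filters=32,          # number of kernels (feature maps) to learn
    kernel_size=(3, 3),  # a common starting point for 64x64 inputs
    padding="same",      # pad so the 64x64 spatial size is preserved
    activation="relu",
)(inputs)

print(x.shape)  # (None, 64, 64, 32)
```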
Second, pool size: pooling layers are used to downsample the spatial dimensions of the input data while retaining important information.
Common pool sizes are 2x2 or 3x3, and again, this needs experimentation.
Okay, maybe you are wondering how reducing the spatial dimensions would help you.
The answer: reducing the spatial dimensions of the data reduces the number of parameters and the amount of computation in the layers that follow.
So for your 64 x 64 example, a common pool size would be 2x2, which halves each spatial dimension (64 becomes 32).
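As a quick illustration (again assuming Keras; the layer sizes are made up), a 2x2 max pool halves the height and width:

```python
import tensorflow as tf

# e.g. the output of a conv layer: 64x64 with 32 feature maps
x = tf.keras.Input(shape=(64, 64, 32))

# pool_size=(2, 2) with its default stride reduces 64x64 to 32x32
pooled = tf.keras.layers.MaxPooling2D(pool_size=(2, 2))(x)

print(pooled.shape)  # (None, 32, 32, 32)
```

Every layer after the pool now works on a quarter of the pixels, which is where the savings in parameters and computation come from.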
Finally, strides: strides determine the step size at which the convolution kernel or pooling window moves across the input data. A stride of 1 means the kernel or window moves one pixel at a time, while a stride of 2 means it moves two pixels at a time. Larger strides result in a greater reduction in spatial dimensions. In many cases, a stride of 1 is used for convolutions, and a stride of 2 is used for pooling to achieve downsampling.
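You can see the effect of the stride directly in the output shapes; this sketch again assumes Keras, with arbitrary filter counts:

```python
import tensorflow as tf

inputs = tf.keras.Input(shape=(64, 64, 1))

# stride 1: the kernel moves one pixel at a time
s1 = tf.keras.layers.Conv2D(16, (3, 3), strides=1, padding="same")(inputs)

# stride 2: the kernel skips every other pixel, halving height and width
s2 = tf.keras.layers.Conv2D(16, (3, 3), strides=2, padding="same")(inputs)

print(s1.shape)  # (None, 64, 64, 16)
print(s2.shape)  # (None, 32, 32, 16)
```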
Always remember to build your model as quickly as possible first, then start tuning to improve it. So when I say a kernel size or pool size is common, that doesn't mean it will work best in your case; we just start with common choices to build the model, then tune from there.
Let's now break down the formatted shape, which is [(None, 64, 84, 1)]:
None: Represents the batch size; it can vary depending on how many samples you're processing at once during training or inference.
64: Represents the height of the image (vertical dimension).
84: Represents the width of the image (horizontal dimension).
1: Represents the number of channels; in this case it's a grayscale image.
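You can reproduce that exact shape yourself; here is a minimal sketch (assuming your inputs really are 64x84 grayscale images, with an arbitrary conv layer just so the model has something in it):

```python
import tensorflow as tf

inputs = tf.keras.Input(shape=(64, 84, 1))  # height=64, width=84, channels=1
outputs = tf.keras.layers.Conv2D(8, (3, 3), padding="same")(inputs)
model = tf.keras.Model(inputs, outputs)

# The InputLayer row reports the input shape with batch size None;
# older Keras versions print it exactly as [(None, 64, 84, 1)]
model.summary()
```

The batch dimension stays None until you actually call the model on data; for example, model(tf.zeros((8, 64, 84, 1))) produces an output with batch size 8.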
I hope this helps and that things are clearer now.
Best Regards,
Jamal