C4W3 video 5; Question about stride in Convolutional implementation of sliding window

Mohamed_Akram · September 4, 2023, 10:28am

What if the stride of the sliding window is different from that of the first convolution? And what if it is different from that of the pooling layer? Or should they always be the same (first convolution with stride 1, and max pool with the same stride as the sliding window)?

Juan_Olano · September 4, 2023, 12:24pm

In a CNN, the strides of the first convolutional layer, the subsequent convolutional layers, and the pooling layers can all be different. They don’t have to be the same.

Mohamed_Akram · September 4, 2023, 12:48pm

The video is about object detection using a sliding window and focuses on using convolution instead of the sliding window for detection. I apologize for the confusion regarding the detection sliding window.

If the stride of the max pool layer is 2 and the stride of the sliding window is also 2, then the case presented in the video is fine. However, if the max pool’s stride is different, what will happen? Will it still work? The same question applies to the stride of the first convolution layer. Will the algorithm work if the stride is not 1 for the first conv layer?

Jamal022 · September 4, 2023, 1:06pm

Hey @Mohamed_Akram,

As @Juan_Olano said, the strides of the first conv layer and the other layers don’t have to be the same.

But let’s address your question:

Stride of Max Pool Layer Different from Sliding Window:

If the stride of the max-pooling layer is different from the stride of the sliding window, it can still work, but it will affect the spatial resolution of the feature maps. In the convolutional implementation the window stride is not set separately: it equals the product of the strides of the layers, so a stride-2 max pool after a stride-1 convolution evaluates the window every 2 pixels of the input.
A larger stride in the max-pooling layer (e.g., 2 instead of 1) reduces the spatial dimensions of the feature maps. This means you’ll have coarser feature maps to work with in subsequent layers, which may impact the algorithm’s ability to localize objects precisely and to detect smaller objects.
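
To make the stride relationship concrete, here is a minimal sketch (not from the video) that tracks the output size, the window size, and the effective window stride through a stack of unpadded conv/pool layers. The helper `sliding_window_geometry` and the `(kernel, stride)` layer lists are hypothetical names used only for illustration. `LECTURE_NET` assumes the layer sizes of the lecture example (5×5 conv, 2×2 max pool with stride 2, then the FC layers written as a 5×5 conv and two 1×1 convs); `POOL_STRIDE_1_NET` is a made-up variant.

```python
def sliding_window_geometry(layers, input_size):
    """Return (output_size, window_size, window_stride) along one spatial axis.

    layers: list of (kernel, stride) pairs for conv/pool layers without padding.
    """
    size = input_size
    window = 1   # receptive field of one output cell, in input pixels
    stride = 1   # distance in input pixels between neighbouring output cells
    for kernel, layer_stride in layers:
        size = (size - kernel) // layer_stride + 1
        window += (kernel - 1) * stride
        stride *= layer_stride
    return size, window, stride


# Assumed lecture network: conv 5x5/1, max pool 2x2/2, FC as conv 5x5/1, 1x1, 1x1.
LECTURE_NET = [(5, 1), (2, 2), (5, 1), (1, 1), (1, 1)]
# Hypothetical variant: same layers, but the max pool uses stride 1.
POOL_STRIDE_1_NET = [(5, 1), (2, 1), (5, 1), (1, 1), (1, 1)]

print(sliding_window_geometry(LECTURE_NET, 14))        # (1, 14, 2)
print(sliding_window_geometry(LECTURE_NET, 16))        # (2, 14, 2)
print(sliding_window_geometry(LECTURE_NET, 28))        # (8, 14, 2)
print(sliding_window_geometry(POOL_STRIDE_1_NET, 14))  # (5, 10, 1)
```

Under these assumptions, the lecture network behaves like a 14×14 window moved 2 pixels at a time. Changing the pool stride to 1 still produces a valid grid of predictions, but the window shrinks to 10×10 and moves 1 pixel at a time, so the network would have to be trained on 10×10 crops.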

Stride of First Convolution Layer Not 1:

If the stride of the first convolutional layer is not 1, it will also impact the algorithm’s ability to capture fine-grained details and localize objects.
As with pooling, a larger stride in the first convolutional layer means that the initial feature maps will have reduced spatial resolution, which can make it more challenging to detect objects accurately, especially small or closely spaced objects.
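
One way to check whether a first-conv stride other than 1 still works is to compare a single convolutional pass against explicitly cropped windows. The sketch below does this in 1-D with NumPy under assumed settings: random weights, a size-5 conv with stride 2, a size-2 max pool with stride 2, and a size-5 "FC as conv" layer. All function names are illustrative helpers, not course code. With these strides the window is 23 samples wide and moves 4 samples at a time.

```python
import numpy as np


def conv1d(x, w, stride):
    """Unpadded 1-D cross-correlation of x with kernel w."""
    k = len(w)
    n = (len(x) - k) // stride + 1
    return np.array([np.dot(x[i * stride:i * stride + k], w) for i in range(n)])


def maxpool1d(x, size, stride):
    """Unpadded 1-D max pooling."""
    n = (len(x) - size) // stride + 1
    return np.array([x[i * stride:i * stride + size].max() for i in range(n)])


def network(x, w1, w2, conv_stride=2, pool_stride=2):
    """conv -> ReLU -> max pool -> 'FC as conv' with stride 1."""
    h = np.maximum(conv1d(x, w1, conv_stride), 0.0)
    h = maxpool1d(h, 2, pool_stride)
    return conv1d(h, w2, 1)


rng = np.random.default_rng(0)
w1 = rng.normal(size=5)
w2 = rng.normal(size=5)
x = rng.normal(size=35)

WINDOW = 23  # 1 + (5-1)*1 + (2-1)*2 + (5-1)*4
STRIDE = 4   # conv stride 2 * pool stride 2

# One convolutional pass over the whole signal.
full = network(x, w1, w2)

# Explicit sliding window: crop, run the network, keep its single output.
starts = range(0, len(x) - WINDOW + 1, STRIDE)
crops = np.array([network(x[s:s + WINDOW], w1, w2)[0] for s in starts])

print(full.shape, np.allclose(full, crops))
```

If the two results match, each output of the full pass equals the network applied to one 23-sample window, with consecutive windows 4 samples apart. In this sketch a stride-2 first conv still works; the detector just samples window positions more coarsely.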

In practice, many object detection architectures start with a small stride (e.g., 1) in the first convolutional layer to capture fine-grained features and details.

I hope this makes sense now.
Regards,
Jamal
