DLS C4W3 4th video clarification needed please

Mun_Chung_Wong · June 19, 2023, 5:48am

Hi experts,
I need some clarifications on the 4th video, titled "Convolutional Implementation of Sliding Windows":

At 9:10 in the video, the leftmost input image is 28x28x3, and Andrew mentions that it's using a stride of 2. But after the first 5x5 CONV, the image is 24x24x16. Shouldn't it be:
floor((28 - 5)/2 + 1) = floor(23/2 + 1) = 12?
The above corresponds to course notes slide 13.
Also, around 10:02 in the video (course notes slide 14), how does the input dimension go from 28x28 to 16x16 with a 5x5 CONV? Have I missed something fundamental here?

Thanks,
MCW

rmwkwok · June 19, 2023, 6:31am

Hello @Mun_Chung_Wong,

Thank you for the clear references.

My explanation below is based on the 16 x 16 x 3 input, but the same logic applies to your question about the 28 x 28 x 3 input.

To begin with, keep in mind that the slide discusses two methods for making predictions on an image:

Method A: Manually slice 14 x 14 crops out of the image, one at a time, at a stride of 2. We end up with four 14 x 14 images and four predictions.
Method B: Don't slice. Just feed the whole 16 x 16 input into the model, and it also gives four predictions.

Note that only method A has that so-called stride 2. That is the stride for manual slicing.

Now, if you look at the above convolution, it is the result of applying method B. A 5x5 filter at a stride of 1 converts 16 x 16 to 12 x 12 (16 - 5 + 1 = 12). That's it! It's method B, not method A, so there is no stride of 2.

Similarly, if the input is 28 x 28, then the filter converts it to 24 x 24 (28 - 5 + 1 = 24). Method B, that's it, no stride of 2.
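
A minimal sketch of why methods A and B give the same four predictions. It is not the lecture's network: it assumes a single channel, random weights, no activation, and a toy stack of 5x5 CONV, 2x2 MAX POOL (stride 2), then a 5x5 CONV standing in for the FC layer. The names `toy_net`, `crop`, and `random_square` are made up for illustration.

```python
import random

def conv2d_valid(x, k):
    """Stride-1, no-padding convolution on square inputs (cross-correlation)."""
    f = len(k)
    size = len(x) - f + 1
    return [[sum(x[i + a][j + b] * k[a][b] for a in range(f) for b in range(f))
             for j in range(size)] for i in range(size)]

def maxpool2x2(x, stride=2):
    """2x2 max pooling with the given stride on a square input."""
    size = (len(x) - 2) // stride + 1
    return [[max(x[i * stride + a][j * stride + b]
                 for a in range(2) for b in range(2))
             for j in range(size)] for i in range(size)]

def toy_net(x, k1, k2):
    """5x5 CONV -> 2x2 MAX POOL -> 5x5 CONV (the FC layer written as a CONV)."""
    return conv2d_valid(maxpool2x2(conv2d_valid(x, k1)), k2)

def random_square(n, rng):
    return [[rng.uniform(-1, 1) for _ in range(n)] for _ in range(n)]

def crop(x, r, c, size=14):
    """Cut a size x size window whose top-left corner is at (r, c)."""
    return [row[c:c + size] for row in x[r:r + size]]

rng = random.Random(0)
k1, k2 = random_square(5, rng), random_square(5, rng)
image = random_square(16, rng)

# Method B: feed the whole 16x16 image through the network once.
method_b = toy_net(image, k1, k2)

# Method A: slice 14x14 crops at a stride of 2 and run each crop separately.
method_a = [[toy_net(crop(image, r, c), k1, k2)[0][0] for c in (0, 2)]
            for r in (0, 2)]

max_gap = max(abs(method_a[i][j] - method_b[i][j])
              for i in range(2) for j in range(2))
print(len(method_b), len(method_b[0]))  # 2 2
print(max_gap < 1e-9)                   # True

# The same network on a 28x28 input yields an 8x8 map of predictions.
big = toy_net(random_square(28, rng), k1, k2)
print(len(big), len(big[0]))            # 8 8
```

No layer in `toy_net` is told about a stride of 2 for the window; that stride only appears in the manual slicing of method A.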

As for your second question, that is a known error. Please check out the reading item right before that video.

Cheers,
Raymond

Mun_Chung_Wong · June 19, 2023, 6:49am

Thanks Raymond for the clear explanations.

As for my 2nd question, sorry, I only noticed the link to the additional correction after posting my questions.

Cheers,
MCW

rmwkwok · June 19, 2023, 7:08am

You are welcome, @Mun_Chung_Wong!

Cheers,
Raymond

rmwkwok · June 20, 2023, 12:03pm

4 posts were split to a new topic: C4 W3 Bounding Box Predictions - wrong assignments to bh and bw?

Jeffrey_Antony · June 9, 2024, 6:48am

@rmwkwok I didn't understand why the professor said that the stride of 2 comes from the 2x2 MAX POOL. Could you explain that?

Can I assume the stride-2 calculation goes as below?

The input is 28x28 and the output is 8x8 (skipping the channels for the time being).
The filter size is 14x14, because we are sliding our original 14x14-input ConvNet over the 28x28 image.
So n = 28, f = 14, and the output is 8; s is unknown.

The formula (without padding) for calculating the output size is
\lfloor \frac{n-f}{s} + 1 \rfloor = \text{output}

Filling in the known values to compute s,

\lfloor \frac{28-14}{s} + 1 \rfloor = 8
rearranging terms (the integer 1 can move out of the floor)
\lfloor \frac{28-14}{s} \rfloor = 8 - 1
which simplifies to
\lfloor \frac{14}{s} \rfloor = 7
and solving for s
s = \frac{14}{7} = 2
which gives s = 2, the only integer stride that satisfies the equation.

Trying this on the 16x16x3 input image also gives a stride of s = 2.
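
Because of the floor, rearranging the formula by hand can hide other solutions, so a brute-force check is a safe way to confirm the result. The sketch below assumes integer strides only, and the helper names `output_size` and `strides_matching` are made up for illustration.

```python
def output_size(n, f, s):
    """No-padding output size: floor((n - f) / s) + 1."""
    return (n - f) // s + 1

def strides_matching(n, f, target):
    """All integer strides s (1..n) whose output size equals target."""
    return [s for s in range(1, n + 1) if output_size(n, f, s) == target]

print(strides_matching(28, 14, 8))  # [2]
print(strides_matching(16, 14, 2))  # [2]
```

In both cases 2 is the only integer stride that fits.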

@Mun_Chung_Wong Thanks for asking this question. I had the same one in mind.

rmwkwok · June 9, 2024, 10:58am

Hello, @Jeffrey_Antony,

To understand that remark of Andrew’s, the best way is to repeat the conversions below and change the max pooling’s stride to other values.

max-pooling stride	output	effective stride
2	8x8	2 (as you calculated)
3	4x4	4
4	2x2	anything from 8 to 14

The max-pooling’s stride is not always equal to the effective stride, so the lecture’s example is a beautiful coincidence.

Andrew’s remark is correct in the sense that the effective stride has to do with the max pooling’s stride (as the table shows; otherwise the last column wouldn’t change with the first column), but it was not establishing a quantitative relation.
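
The table can be reproduced with a few lines of arithmetic. This is a sketch under the thread's assumptions: a 28x28 input, a 5x5 CONV, a 2x2 MAX POOL whose stride is varied, a second 5x5 CONV, and a sliding-window size held at 14 when solving for the effective stride. The function names are hypothetical.

```python
def output_size(n, f, s=1):
    """No-padding output size: floor((n - f) / s) + 1."""
    return (n - f) // s + 1

def final_map_size(n, pool_stride):
    """Side length after 5x5 CONV -> 2x2 MAX POOL -> 5x5 CONV."""
    after_conv1 = output_size(n, 5)                       # 28 -> 24
    after_pool = output_size(after_conv1, 2, pool_stride)  # pooling stride varies
    return output_size(after_pool, 5)

def effective_strides(n, window, out):
    """Integer strides at which a window of this size yields `out` positions."""
    return [s for s in range(1, n + 1) if output_size(n, window, s) == out]

table = {}
for pool_stride in (2, 3, 4):
    out = final_map_size(28, pool_stride)
    table[pool_stride] = (out, effective_strides(28, 14, out))
    print(pool_stride, f"{out}x{out}", table[pool_stride][1])
# 2 8x8 [2]
# 3 4x4 [4]
# 4 2x2 [8, 9, 10, 11, 12, 13, 14]
```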

Cheers,
Raymond

Jeffrey_Antony · June 10, 2024, 4:31am

Thank you for the explanation.
