The horiz_start and vert_start values at the top left corner are both 0. So if stride = 1 and horiz_start and vert_start are both 0, our starting point will be the top left corner.
I don't think so. What's the gain then, if the sliding window is not moving?
You can do it either way. You can think of multiplication as just successive addition. 1 + 1 + 1 = 3 or 1 * 3 = 3, right? It just seems a bit cleaner to do it with multiplication. Otherwise it takes another line of code to set vert_start to 0 outside the loop, right?
That was covered in the lectures. The filter steps across the image starting in the upper left corner, which is (0,0) in the normal way image coordinates work. Of course the fundamental thing here is that all indexing is "0-based" in Python. Where else would you start? 1 is not the first position.
Exactly. So I don't see any reason why we can't just add the stride. And if it is zero, it will only take 1 step.
For example, with 1 it is OK, but what about when we get to index 5 and want a stride of 3? 5 * 3 = 15. That is not 3 steps on from 5, as it was in the lecture.
I don't understand your point. We need to keep clear on several things:
The loops here are over the output space and we must not skip any positions in the output space.
Then we compute where that output comes from in the input space. That is where the stride happens. Then we also have to remember the filter size and the size of height and width dimensions of the image.
So what is the stride and what is the h index in your 3 * 5 = 15 example? If the stride is 5, then the successive positions in the input will be:
0, 5, 10, 15 …
How many such steps depends on the size of the image, of course.
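To make the step count concrete, here is a minimal sketch that lists the input start positions for a given stride. The function name `window_starts`, the image size of 20 and the filter size f = 3 are assumptions chosen for illustration, not values from the assignment, and padding is ignored.

```python
def window_starts(n_in, f, stride):
    """Return the input start index for every output position (no padding)."""
    # Number of positions where a window of size f still fits inside n_in.
    n_out = (n_in - f) // stride + 1
    # The loop runs over the output space; the stride is applied to the input index.
    return [h * stride for h in range(n_out)]

print(window_starts(20, 3, 5))  # [0, 5, 10, 15]
```

Every output index h is visited, and only the position in the input jumps by the stride.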
h is 5. Stride is 3. So the second iteration would start from index 15. As I understand the concept of a stride, we should start the second iteration from 5 + 3 = 8 of h, shouldn't we?
This may be just a language issue and I'm not understanding your point, but no, the second iteration would not start at 15.
Let's make this more concrete. Let's say that the vertical size of our input image is 20 and the stride is 3. Then here are the vert_start values on various iterations:
h = 0, vert_start = 0
h = 1, vert_start = 3
h = 2, vert_start = 6
h = 3, vert_start = 9
h = 4, vert_start = 12
So you can see that you can compute vert_start as either h * stride or you can keep it as a running sum and add the stride value on each iteration.
Note also that h there is the index into the vertical dimension of the output space and vert_start is the index into the vertical dimension of the input image.
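The distinction between the output index h and the input index vert_start can be sketched as below. This is only an illustration: the helper name `vertical_windows` is hypothetical, and the filter size f = 3 is an assumption, since the example above only gives the image height (20) and the stride (3).

```python
def vertical_windows(n_in, f, stride):
    """Yield (h, vert_start, vert_end) for each output row h (no padding)."""
    n_out = (n_in - f) // stride + 1
    for h in range(n_out):
        vert_start = h * stride      # index into the input image
        vert_end = vert_start + f    # exclusive end of the slice
        yield h, vert_start, vert_end

# Assumed values: input height 20, filter size 3, stride 3.
rows = list(vertical_windows(n_in=20, f=3, stride=3))
for h, vert_start, vert_end in rows:
    print(f"h = {h}, input rows [{vert_start}:{vert_end}]")
```

With the assumed f = 3 this produces one more window (h = 5, starting at 15) than the five rows listed above, and the last window still ends inside the 20 input rows.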
how did we get
h = 3, vert_start = 6
3 * 3 = 9, isn't it?
Sorry!!! That was just a typo on my part. I skipped one h value and that threw everything off.
I have fixed it and now it should be consistent.
Oh, I understand now, thanks. So we have to account for the previous strides in the index. That is why we can't just add the stride to the given index.
I made the point earlier that you can either use multiplication or addition for handling the stride. Watch this:
stride = 3
print("Using multiplication")
for h in range(4):
    vert_start = h * stride
    print(f"h {h}, vert_start {vert_start}")
print("Using addition")
vert_start = 0
for h in range(4):
    print(f"h {h}, vert_start {vert_start}")
    vert_start = vert_start + stride
Running that gives these results:
Using multiplication
h 0, vert_start 0
h 1, vert_start 3
h 2, vert_start 6
h 3, vert_start 9
Using addition
h 0, vert_start 0
h 1, vert_start 3
h 2, vert_start 6
h 3, vert_start 9
So either method works, but I think the multiplication method is simpler to write and clearer code.
I got this. I thought @someone555777 was asking about vert_start = stride * h and vert_start = stride + h.
Hello there,
I am facing an issue with the same topic.
This is my code
for c in range(n_C):
    a_slice_prev = a_prev_pad[vert_start:vert_end, horiz_start:horiz_end, :]
    weights = W[:,:,:,c]
    biases = b[:,:,:,c]
    Z[i, h, w, c] = conv_single_step(a_slice_prev, weights, biases)
assert(Z.shape == (m, n_H, n_W, n_C))
in
8 Z, cache_conv = conv_forward(A_prev, W, b, hparameters)
9 z_mean = np.mean(Z)
---> 10 z_0_2_1 = Z[0, 2, 1]
11 cache_0_1_2_3 = cache_conv[0][1][2][3]
12 print("Z's mean =\n", z_mean)
IndexError: index 2 is out of bounds for axis 1 with size 2
I am getting the same error every time. I have been trying for the last 2 days to debug this error. Kindly help.
What is the shape of the Z value that is returned by your conv_forward function in that case?
I added some print statements to my code and here's what I see when I run that set of test cases:
stride 2 pad 1
New dimensions = 3 by 4
Shape Z = (2, 3, 4, 8)
Shape A_prev_pad = (2, 7, 9, 4)
Z[0,0,0,0] = -2.651123629553914
Z[1,2,3,7] = 0.4427056509973153
Z's mean =
0.5511276474566768
Z[0,2,1] =
[-2.17796037 8.07171329 -0.5772704 3.36286738 4.48113645 -2.89198428
10.99288867 3.03171932]
cache_conv[0][1][2][3] =
[-1.1191154 1.9560789 -0.3264995 -1.34267579]
First Test: All tests passed!
stride 1 pad 3
New dimensions = 9 by 11
Shape Z = (2, 9, 11, 8)
Shape A_prev_pad = (2, 11, 13, 4)
Z[0,0,0,0] = 1.4306973717089302
Z[1,8,10,7] = -0.6695027738712113
stride 2 pad 0
New dimensions = 2 by 3
Shape Z = (2, 2, 3, 8)
Shape A_prev_pad = (2, 5, 7, 4)
Z[0,0,0,0] = 8.430161780192094
Z[1,1,2,7] = -0.2674960203423288
stride 1 pad 6
New dimensions = 13 by 15
Shape Z = (2, 13, 15, 8)
Shape A_prev_pad = (2, 17, 19, 4)
Z[0,0,0,0] = 0.5619706599772282
Z[1,12,14,7] = -1.622674822605305
Second Test: All tests passed!
(2, 2, 3, 8)
(2, 2, 3, 8)
(2, 2, 3, 8)
IndexError Traceback (most recent call last)
in
8 Z, cache_conv = conv_forward(A_prev, W, b, hparameters)
9 z_mean = np.mean(Z)
---> 10 z_0_2_1 = Z[0, 2, 1]
11 cache_0_1_2_3 = cache_conv[0][1][2][3]
12 print("Z's mean =\n", z_mean)
IndexError: index 2 is out of bounds for axis 1 with size 2
Shape of Z is (2,2,3,8)
Ok, that shows where the problem is. In the first test case the shape of Z should be 2 x 3 x 4 x 8, as you can see from the printouts that I showed.
So why is your Z the wrong shape?
You can see the shape of A_prev is 2 x 5 x 7 x 4, and pad = 1, so A_prev_pad is 2 x 7 x 9 x 4.
We have stride = 2, pad = 1, f = 3, nH_{prev} = 5 and nW_{prev} = 7.
The formula is:
n_{out} = \displaystyle \lfloor \frac {n_{in} + 2p - f}{s} \rfloor + 1
So for nH_{out}, we have:
nH_{out} = \displaystyle \lfloor \frac {5 + 2 * 1 - 3}{2} \rfloor + 1 = \displaystyle \lfloor \frac {4}{2} \rfloor + 1 = 3
So how did you end up with 2 as that dimension?
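As a quick way to check that arithmetic, here is a small sketch of the formula in Python. The function name `conv_output_size` is made up for this illustration and is not part of the assignment code; the input values are the ones quoted above for the first test case.

```python
def conv_output_size(n_in, f, pad, stride):
    """floor((n_in + 2*pad - f) / stride) + 1, using integer floor division."""
    return (n_in + 2 * pad - f) // stride + 1

# First test case: A_prev is 2 x 5 x 7 x 4, f = 3, pad = 1, stride = 2.
n_H = conv_output_size(5, f=3, pad=1, stride=2)
n_W = conv_output_size(7, f=3, pad=1, stride=2)
print(n_H, n_W)  # 3 4
```

Printing these two values next to Z.shape inside conv_forward is one way to see which term of the formula went wrong.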
Thanks a lot!! Corrected the code.
Silly mistakes, anyway thanks for the help Paul and Boubacar too.