Conv_forward function, Week 1 Assignment 1

I’m getting this error message in the conv_forward function:
“Operands could not be broadcast together with shape (3,3) (3,4)” and I’m not sure why I can’t get the right dimensions on my a_slice_prev and weights

I’m not sure where I’m going wrong, but I have that

for h in range(0,n_H,stride):
    vert_start = h
    vert_end = h+f

I have the same thing for the width. My a_slice_prev assignment looks like:
a_slice_prev = a_prev_pad[vert_start:vert_end, horiz_start:horiz_end, :]

I’m not sure why the dimensions of the array are different from the weights matrix (weights = W[:,:,:,c])
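
For readers unfamiliar with this message: NumPy raises it when an element-wise operation is applied to two arrays whose shapes are incompatible. The sketch below reproduces it with two dummy arrays of the shapes quoted above. The names `left` and `right` are placeholders, not the assignment's variables.

```python
import numpy as np

# Placeholder arrays with the two shapes from the error message.
left = np.ones((3, 3))
right = np.ones((3, 4))

try:
    product = left * right  # element-wise multiply needs compatible shapes
    message = ""
except ValueError as err:
    message = str(err)

print(message)
```

So the error itself only says that the slice and the weights ended up with different shapes; the cause has to be found in how the slice indices are computed.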

Hi jmilch,

Thanks for your question.

Note that the assignment states

# loop over vertical axis of the output volume

and

# loop over horizontal axis of the output volume

So the values of vert_start, vert_end, horiz_start, and horiz_end refer to locations in the output volume. What does this mean for the values you use in selecting slices from the input volume, considering all the parameters that go into the convolution?

You may also want to check the lines of code in which you define the dimensions of the output volume. Look at the formulas relating the output shape of the convolution to the input shape. Did you implement these correctly?

Good luck!

So am I making a mistake in writing the horizontal and vertical loops? I don’t understand why

vert_start = h
vert_end = h+f
horiz_start = w
horiz_end = w+f

doesn’t work. Shouldn’t those be the indices that I am slicing from a_prev_pad?

Also, I’m debugging with print statements and my code is working for a few loops and then for some reason one of the a_slice_prev matrices has dimensions (3,2,4) and the weights matrix has dimensions (3,3,4). All of the previous iterations before that had the a_slice_prev matrix being (3,3,4) so it worked when I called the conv_single_step function that I wrote earlier in the exercise.

I’m not sure why the dimensions of the matrix randomly change to (3,2,4) on a certain iteration but not the multiple iterations that I have performed before.
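
A likely explanation for the shape changing only on some iterations is that NumPy does not raise an error when a slice runs past the end of an axis; it silently returns fewer elements. The sketch below uses a made-up 7x7x4 array (the sizes are assumptions for illustration, not the assignment's test values).

```python
import numpy as np

# Hypothetical padded input: 7 rows, 7 columns, 4 channels.
a_prev_pad = np.zeros((7, 7, 4))
f = 3

inside = a_prev_pad[0:f, 4:4 + f, :]   # columns 4, 5, 6 all exist
overrun = a_prev_pad[0:f, 5:5 + f, :]  # column 7 does not exist, so the slice is cut short

print(inside.shape)   # (3, 3, 4)
print(overrun.shape)  # (3, 2, 4)
```

The early iterations work because their windows fit inside the array; the first window that extends past the edge comes back truncated, and the multiplication with the (3,3,4) weights then fails.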

Hi jmilch,

Look again at the formulas relating the output shape of the convolution to the input shape (see Reminder: under the description of Exercise 3 - conv_forward).

Did you include all the relevant parameters in your loop variables? How could you handle the impact of the denominator in the formulas? What does the denominator mean for the slice to be taken from a_prev_pad?
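
For reference, the standard output-size formula for a convolution is floor((n_prev - f + 2*pad) / stride) + 1, applied separately to height and width. A minimal sketch follows; the helper name `conv_output_size` is hypothetical and not part of the assignment's code.

```python
def conv_output_size(n_prev, f, pad, stride):
    """Output length along one axis for a convolution.

    n_prev: input length along that axis (before padding)
    f:      filter size
    pad:    padding added on each side
    stride: step between consecutive windows in the input
    """
    return (n_prev - f + 2 * pad) // stride + 1


print(conv_output_size(5, 3, 1, 2))  # 3
print(conv_output_size(7, 3, 0, 1))  # 5
```

The stride is the denominator here: doubling it roughly halves the number of output positions along that axis.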

I’m a little confused about what you mean by the denominator. Do you mean the stride value from the formula of the output shape?

Let’s look at the stride you mention. How does the stride affect the relationship between the input volume and the output volume? Did you account for this in your coding?

Yes, when calculating n_W and n_H. I’m accounting for stride in my for loops by defining it as:

for h in range(0,n_H,stride), so each iteration increases h by the stride.

Is this what you are talking about?

Hi jmilch,

By including the stride as a step in your loop you reduce the number of rows in the output volume you perform calculations for (you jump from row 0 to row 2 etc., skipping the odd rows). This is not what you want. Can you think of another way to handle the stride without skipping rows?
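
The skipping is easy to see by listing the values the loop variable takes. The numbers below are made up for illustration.

```python
n_H = 4      # hypothetical number of output rows
stride = 2

stepped = list(range(0, n_H, stride))  # stride used as the range step
every_row = list(range(n_H))           # one iteration per output row

print(stepped)    # [0, 2]
print(every_row)  # [0, 1, 2, 3]
```

With the stride as the step, output rows 1 and 3 are never visited, so they are never filled in.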

I could make the loop:

for h in range(n_H):
    horiz_start = h
    horiz_end = h+f
    h += stride

Hi jmilch,

That boils down to the same thing.
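
A Python detail is also relevant to the snippet above: reassigning the loop variable inside a for loop over range() does not change what the next iteration receives. A small sketch with made-up numbers:

```python
n_H = 4
stride = 2

seen = []
for h in range(n_H):
    seen.append(h)
    h += stride  # rebinds h for the rest of this pass only; range() supplies the next value

print(seen)  # [0, 1, 2, 3]
```

So incrementing h at the end of the loop body has no effect on the following iteration, and the stride never reaches the slice indices this way.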

Maybe it helps to take a piece of paper and draw a (2D) input volume and an output volume and try to think what the stride does and how this relates the input volume to the output volume. Otherwise, have another look at the Strided Convolutions video!

Isn’t the entire point of strided convolutions to skip rows?

In the input volume, yes. Not in the output volume.

Should I not be looping over the input volume instead of the output volume? Something along the lines of

for h in range(0, a_prev_pad[1] - f, stride)

Yes, you could do something like that. In the assignment, the choice is made to loop over the output volume. On the one hand this can be a bit confusing; on the other hand, once you get the calculation with the stride correct it becomes quite intuitive.

Maybe this will help. If you move one cell in the output volume, how many cells do you move in the input volume?

Iterating over the input volume makes considerably more sense to me. I tried implementing that for loop but I’m essentially getting the same error. This is the output that I am getting. I’m not sure why the code poops out on that specific iteration.

Say you want to shift two places in an input_index with every place in an output_index, say for a range of 10. What you would get is:

(deleted)

Conversely, if you want to shift one place in an output_index with every two places in an input_index, you would get:

(deleted)

Whichever you choose, you can see that there is a factor of 2 involved between input_index and output_index as calculated using the iterator variable.

Does this help?

It still doesn’t make any sense to me why I would loop over the output volume when I’m taking a slice of the previous A matrix. Also, do you understand why I’m getting that error message, what that error message means, and why I’m getting it on a random iteration?

The error message shows that the value of w is out of bounds.

As to your question about looping over the output volume, whichever way you code it you need to perform this loop because you want to fill every cell with an activation value - that’s what the output volume is there for. You also have to loop over the input volume because this is where you get the values that you want to apply your filters to.

So you’re looping over both your input volume and your output volume as explained in the video on Strided Convolutions. With every strided step in your input volume you take one step in your output volume. So the relation is stride:1. This you have to capture in your loop.

To do this, there are two possibilities. Either you loop over the input volume using the iterator variable, while you loop over your output volume with a relationship of (deleted) inside the loop; or you loop over your output volume using the iterator variable, while you loop over your input volume with a relationship of (deleted) inside the loop. Whichever approach you take, the slices you take from the input volume are based on the loop value for the input volume - which is directly related to the loop value for the output volume.

Phrased yet another way, you can use the iterator value for the output volume and calculate the related index value for the input volume; or you can use the iterator value for the input volume and calculate the related index value for the output volume. It really doesn’t matter for the logic of the process, as long as you take the index value for the input volume to get the slice from the input volume.

So why not just loop over the output volume and use the related index value for the input volume to take a slice? This is what the suggested code proposes. If you want to do it differently you have to change the suggested code, and you still have to make sure you implement the stride correctly, take the correct slice from the input volume, and put the result in the correct cell in the output volume. This is an interesting exercise but it does not form part of the assignment, which makes it tricky for others to provide feedback for.