Course 4 Week 1 Assignment 1: conv_forward

I am getting the error shown below:

Z’s mean =
Z[0,2,1] =
[-5.42280893 1.88549165 -4.09974126 -3.48941271 -5.34564475 3.56732833
1.77640635 -2.33586583]
cache_conv[0][1][2][3] =
[-1.1191154 1.9560789 -0.3264995 -1.34267579]

ValueError Traceback (most recent call last)
11 print("cache_conv[0][1][2][3] =\n", cache_conv[0][1][2][3])
---> 13 conv_forward_test(conv_forward)

~/work/release/W1A1/ in conv_forward_test(target)
65 b = np.random.randn(1, 1, 1, 8)
---> 67 Z, cache_conv = target(A_prev, W, b, {"pad" : 3, "stride": 1})
68 Z_shape = Z.shape
69 assert Z_shape[0] == A_prev.shape[0], f"m is wrong. Current: {Z_shape[0]}. Expected: {A_prev.shape[0]}"

in conv_forward(A_prev, W, b, hparameters)
82 weights = W[:,:,:,c]
83 biases = b[:,:,:,c]
---> 84 Z[i,h,w,c] = conv_single_step(a_slice_prev, weights, biases)

in conv_single_step(a_slice_prev, W, b)
23 # Z = None
---> 25 s = np.multiply(a_slice_prev,W)
26 Z = np.sum(s)
27 Z = Z+ float(b)

ValueError: operands could not be broadcast together with shapes (3,2,4) (3,3,4)

After further inspection, I believe my error is in my horizontal stride calculations, but I am confused about how to index. I have spent the whole day trying to figure out how to index the corners and I am getting nowhere. Are there any hints? I have looked through past Discourse threads but could not find anything new that helps. Thank you for your time and help.

I would like to add that I did include the stride in horiz_start, as a multiple of the stride with respect to the loop index.

That sounds correct. The key point is that the loops are over the output space: you step through each position in the output space, not skipping any. Then at each point, you have to calculate where that maps to in the input space. For that you multiply by the stride, as you described.
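To make that mapping concrete, here is a minimal sketch of going from an output position to the corners of its input window. The helper name `slice_corners` and the array sizes are made up for illustration; this is not the notebook's reference code.

```python
import numpy as np

def slice_corners(h, w, stride, f):
    """Map output position (h, w) to the corners of its f x f input window."""
    vert_start = h * stride        # step through the input in units of stride
    vert_end = vert_start + f      # the window is always f rows tall
    horiz_start = w * stride
    horiz_end = horiz_start + f    # and f columns wide
    return vert_start, vert_end, horiz_start, horiz_end

# Hypothetical sizes: one padded example of shape (7, 9, 4), f = 3, stride = 2.
a_prev_pad = np.zeros((7, 9, 4))
f, stride = 3, 2
n_H = (a_prev_pad.shape[0] - f) // stride + 1
n_W = (a_prev_pad.shape[1] - f) // stride + 1

# Loop over every output position, skipping none, and record the slice shapes.
shapes = set()
for h in range(n_H):
    for w in range(n_W):
        vs, ve, hs, he = slice_corners(h, w, stride, f)
        shapes.add(a_prev_pad[vs:ve, hs:he, :].shape)

print(shapes)  # every slice should have the same f x f x n_C_prev shape
```

If the set ever contains more than one shape, either the corners or the output dimensions are off.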

The other possibility is that you did not calculate the shape of the output object correctly. Here’s a thread which shows some examples of all the calculated dimensions.
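As a quick way to check the output dimensions, the standard formula is n_H = floor((n_H_prev - f + 2 * pad) / stride) + 1, and likewise for n_W. The sketch below applies it to the shapes mentioned in this thread; `conv_output_shape` is a hypothetical helper name, not something from the notebook.

```python
def conv_output_shape(a_prev_shape, w_shape, pad, stride):
    """Return the expected shape of Z for a conv layer (illustrative helper)."""
    m, n_H_prev, n_W_prev, _ = a_prev_shape
    f, _, _, n_C = w_shape
    n_H = (n_H_prev - f + 2 * pad) // stride + 1
    n_W = (n_W_prev - f + 2 * pad) // stride + 1
    return (m, n_H, n_W, n_C)

# A_prev of shape (2, 5, 7, 4) with 3 x 3 filters, 8 output channels.
print(conv_output_shape((2, 5, 7, 4), (3, 3, 4, 8), pad=1, stride=2))  # (2, 3, 4, 8)
print(conv_output_shape((2, 5, 7, 4), (3, 3, 4, 8), pad=3, stride=1))  # (2, 9, 11, 8)
```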

The other clue is that the error occurs on the width dimension. The height and width are handled the same way, right?

The other thing to investigate is when this error gets thrown: is it on the very first iteration or do you actually run off the end?
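One way to answer that question is to walk the output positions and report the first one whose slice has the wrong shape. This is only a debugging sketch; `first_bad_slice` and the test arrays are hypothetical.

```python
import numpy as np

def first_bad_slice(a_prev_pad, n_H, n_W, f, stride, n_C_prev):
    """Return ((h, w), shape) for the first mis-shaped slice, or None if all are fine."""
    expected = (f, f, n_C_prev)
    for h in range(n_H):
        for w in range(n_W):
            vs, hs = h * stride, w * stride
            got = a_prev_pad[vs:vs + f, hs:hs + f, :].shape
            if got != expected:
                return (h, w), got
    return None

# Correctly padded input: no bad slice.
print(first_bad_slice(np.zeros((7, 9, 4)), 3, 4, f=3, stride=2, n_C_prev=4))
# Width too small: the loop runs off the end at the last column.
print(first_bad_slice(np.zeros((7, 7, 4)), 3, 4, f=3, stride=2, n_C_prev=4))
# Wrong channel count: fails on the very first iteration.
print(first_bad_slice(np.zeros((7, 9, 2)), 3, 4, f=3, stride=2, n_C_prev=4))
```

A failure at position (0, 0) points at the input array itself, while a failure only at the last row or column points at the padding or the output dimensions.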

Hello pualinpaloalto, I checked my dimensions and found my error: when I first initialized a_prev_pad, I was grabbing from A_prev instead of A_prev_pad. However, I now get a new error in my code, listed below:

ValueError Traceback (most recent call last)
6 "stride": 2}
----> 8 Z, cache_conv = conv_forward(A_prev, W, b, hparameters)
9 print("Z's mean =\n", np.mean(Z))
10 print("Z[0,2,1] =\n", Z[0, 2, 1])

in conv_forward(A_prev, W, b, hparameters)
83 weights = W[:,:,:,c]
84 biases = b[:,:,:,c]
---> 85 Z[i,h,w,c] = conv_single_step(a_slice_prev, weights, biases)

in conv_single_step(a_slice_prev, W, b)
23 # Z = None
---> 25 s = np.multiply(a_slice_prev,W)
26 Z = np.sum(s)
27 Z = Z+ np.float64(b)

ValueError: operands could not be broadcast together with shapes (3,3,2) (3,3,4)

Now this looks like an error with the channels.

I checked: it only gets through one iteration of the c loop before it gives me the error.

The channels should be the easy case. :nerd_face: At each position in the output space, the input should be f x f x nC_{prev}, right?

See, that is where I am a bit confused about W.
I know W has four dimensions, f x f x nC_prev x nC. Since we are looping over nC, each filter we use is f x f x nC_prev. But I still do not see my error in this loop.
I have the following code in my program, and I thought the nC_prev dimension would be handled:
a_slice_prev = a_prev_pad[vert_start:vert_end,horiz_start:horiz_end,:]
weights = W[:,:,:,c]
biases = b[:,:,:,c]
Z[i, h, w, c] = conv_single_step(a_slice_prev, weights, biases)

Channels vs filters: I am using this previous Discourse thread as a reference.

Yes, that thread explains the input and output channels. Your logic for indexing a_prev_pad to create a_slice_prev also looks correct. So that must mean that your a_prev_pad is the wrong shape: the third dimension is too small. So how did that happen? Note the problem is not with W: it should be 3 x 3 x 4 and it is. The problem is that the first operand, which is a_slice_prev in this case, is the wrong shape.
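The shapes in that error message can be reproduced in isolation. The sketch below uses made-up zero-filled slices, only to show that a single filter W[:, :, :, c] is 3 x 3 x 4 and that the elementwise product works only when the slice has the same channel count.

```python
import numpy as np

W = np.random.randn(3, 3, 4, 8)   # f x f x n_C_prev x n_C
weights = W[:, :, :, 0]           # one filter: f x f x n_C_prev
print(weights.shape)              # (3, 3, 4)

good_slice = np.zeros((3, 3, 4))  # matches the filter, elementwise product works
print(np.multiply(good_slice, weights).shape)

bad_slice = np.zeros((3, 3, 2))   # only 2 channels, as in the traceback
try:
    np.multiply(bad_slice, weights)
    broadcast_failed = False
except ValueError as err:
    broadcast_failed = True
    print("ValueError:", err)
```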

OK, I had a suspicion that the slice was the one being indexed wrong. However, in my code I have
a_slice_prev = a_prev_pad[vert_start:vert_end,horiz_start:horiz_end,:]
I agree that a_prev_pad is the wrong shape. For a_prev_pad, I am grabbing the ith training example from A_prev_pad as follows:
a_prev_pad = A_prev_pad[i,:,:,:]. This is done on every iteration of the for loop over m. Am I indexing from the wrong source?

What is the shape of A_prev_pad? That seems to be the key. If that is wrong, then everything else will fail.

The shape of A_prev_pad is (4, 5, 5, 2).

I am using the zero_pad function made earlier in the assignment and I use A_prev & pad as inputs. My zero_pad function passes all the tests as well

But the shape of A_prev in the conv_forward test case is 2 x 5 x 7 x 4 and pad = 1. So the dimension of A_prev_pad should be 2 x 7 x 9 x 4, right? So something is wrong somewhere. You need to figure out what happened.
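That expected shape is easy to confirm with a standalone check. The sketch below pads only the height and width axes with `np.pad`; `zero_pad_sketch` is an illustrative name and is not meant to be the notebook's graded implementation.

```python
import numpy as np

def zero_pad_sketch(X, pad):
    """Zero-pad the height and width axes of a (m, n_H, n_W, n_C) array."""
    return np.pad(
        X,
        ((0, 0), (pad, pad), (pad, pad), (0, 0)),  # no padding on m or channels
        mode="constant",
        constant_values=0,
    )

A_prev = np.random.randn(2, 5, 7, 4)
A_prev_pad = zero_pad_sketch(A_prev, 1)
print(A_prev_pad.shape)  # (2, 7, 9, 4)
```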

Interesting. I fixed the problem, or at least found a temporary fix. My zero_pad function was not returning the correct padded output, so I put the zero_pad functionality directly into my code and it worked. I don't know why this happened. Is there someone I can contact?

If you have the correct implementation of padding in conv_forward, then why doesn’t it work in zero_pad? I don’t know who else to contact.

That is what I am trying to figure out. This doesn't make sense to me. The exercise for zero_pad outputs the correct dimensions. I don't know why I am getting the wrong output when it is used in my conv_forward.

There must be a bug in your zero_pad routine. Just because it passes the test case in the notebook does not mean it is completely correct. E.g. one type of error is referencing global variables rather than the parameters that are passed into the routine. With that kind of bug, it is possible to pass the test cases for the individual routine without being fully correct.
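Here is a sketch of how that kind of bug can hide. It is a guess at the pattern, not the poster's actual code: the global `x` and its shape are assumptions chosen for illustration, and both function names are hypothetical.

```python
import numpy as np

x = np.random.randn(4, 3, 3, 2)  # hypothetical global left over from a test cell

def zero_pad_buggy(X, pad):
    # BUG: pads the global x and ignores the parameter X entirely.
    return np.pad(x, ((0, 0), (pad, pad), (pad, pad), (0, 0)))

def zero_pad_fixed(X, pad):
    # Uses only the parameter that was passed in.
    return np.pad(X, ((0, 0), (pad, pad), (pad, pad), (0, 0)))

# A test that happens to pass the global as the argument cannot tell them apart.
print(zero_pad_buggy(x, 1).shape, zero_pad_fixed(x, 1).shape)

# With any other input, the buggy version still returns the padded global.
A_prev = np.random.randn(2, 5, 7, 4)
print(zero_pad_buggy(A_prev, 1).shape)  # shape of the padded global, not of A_prev
print(zero_pad_fixed(A_prev, 1).shape)  # (2, 7, 9, 4)
```

Restarting the kernel and running only the function's own cell before calling it is a simple way to expose this: the buggy version then fails with a NameError instead of silently using stale data.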