Course 4 Week 1 Exercise 3 - conv_forward Error: Wrong output for variable in position 0

Hi everyone,

I’m really struggling with Exercise 3 of Course 4, Week 1. I get the error message:

Error: Wrong output for variable in position 0.

The error occurs on line 120 in the multiple_test(test_cases, target) function call. Can you please help me narrow down my mistake and show me what I’m doing wrong?

Here’s what I’m doing so far:

Moderator edit: Removed the specific code description, because it was so specific that it was essentially the code itself. Posting your code on the forum is not allowed by the Code of Conduct.

13 Likes

For the last for loop, f isn’t the number of channels but the filter size of each channel.

5 Likes

Thank you! :slight_smile:

You’re right! That’s where I was wrong the whole time :man_facepalming:

All tests passed.

2 Likes

I’ve been looking at this problem for a long time and I can’t find my error. My code is basically exactly the same as what OP outlined, but I’m still getting a similar error:

Z's mean =
 -0.09726298936646298
Z[0,2,1] =
 [-2.17796037  8.07171329 -0.5772704   3.36286738  0.          0.
  0.          0.        ]
cache_conv[0][1][2][3] =
 [-1.1191154   1.9560789  -0.3264995  -1.34267579]
(2, 13, 15, 8)
Error: Wrong output for variable in position 0.
 2  Tests passed
 1  Tests failed

The one uncertainty I had was iterating over n_C versus n_C_prev (or f) in the last loop, but I get a similar error regardless. I know the most common error is forgetting to account for the stride in vert_start and horiz_start, but that does not seem to be my issue here. I sincerely appreciate any help!

2 Likes

Wow, @Bakir , thanks for the clear description of your thoughts up until the point where you had the error. I was getting the stride and number of channels right, but was struggling with a previous step. Following your comment really helped me out!

Just wanted to point out how important it is to contextualize and describe your question.

1 Like

Hey @ben_slacker, did you follow Bakir’s description closely? It really helped me out. Regarding the last loop, if you are still uncertain, keep in mind that it iterates over the number of channels of the output, which equals the number of filters in the convolution layer (f is the filter height and width, not the number of filters). If you are still stuck, try to explain your steps here; maybe we can spot a bug.
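To make the dimension names concrete, here is a minimal sketch of the shape conventions (the shapes are assumed from the notebook’s usual layout, not taken from anyone’s solution code):

```python
import numpy as np

# Assumed shapes: W is (f, f, n_C_prev, n_C), i.e. filter height, filter width,
# input channels, and number of filters (= output channels).
f, n_C_prev, n_C = 3, 4, 8
W = np.zeros((f, f, n_C_prev, n_C))

# The innermost loop should run over the OUTPUT channels (W.shape[3] == n_C),
# not over f (the filter size) or n_C_prev (the input channels).
for c in range(W.shape[3]):
    weights = W[:, :, :, c]          # one complete (f, f, n_C_prev) filter
    assert weights.shape == (f, f, n_C_prev)
```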

3 Likes

Apologies for the very delayed response, but I had a silly error where I was iterating over the wrong dimension for one of my loops. Thanks.

1 Like

I am struggling with this also,

I have followed the above reasoning, but I feel this may be a random number generation issue. I get very close to the values in the public tests, but not close enough.

For example for Z[0,0,0,0] I get -2.65112345232583 vs -2.65112363

Any suggestions would be appreciated

2 Likes

I reloaded the assignment and everything now works. Not sure what the issue was.

1 Like

Struggling with this also. My logic matches Bakir’s post, but I still fail 1 test:
Z’s mean =
0.11920058835389002
Z[0,2,1] =
[0. 0. 0. 0. 0. 0. 0. 0.]
cache_conv[0][1][2][3] =
[-1.1191154 1.9560789 -0.3264995 -1.34267579]
(2, 13, 15, 8)
Error: Wrong output for variable in position 0.
2 Tests passed
1 Tests failed

I handle stride a little differently and was wondering if this could be the cause of my problem:
for h in range(0, n_H, stride):
for w in range(0, n_W, stride):
Any help would be appreciated. Thanks

2 Likes

OK, I realize that I handled stride incorrectly by striding over the OUTPUT volume rather than the INPUT volume, but I’m still getting 1 failed test:
Z’s mean =
0.009946778332892073
Z[0,2,1] =
[-2.17796037 8.07171329 -0.5772704 3.36286738 0. 0.
0. 0. ]
cache_conv[0][1][2][3] =
[-1.1191154 1.9560789 -0.3264995 -1.34267579]
(2, 13, 15, 8)
Error: Wrong output for variable in position 0.
2 Tests passed
1 Tests failed

I handle stride the following way:
{moderator edit: code removed}

Does this make sense?

2 Likes

Figured it out — in my channel loop, I was iterating over previous channel count instead of output channel count.

2 Likes

Hi everyone,

I have followed all the steps explained by @Bakir, but it did not work for me. I am not sure what I have done wrong, and I was wondering if someone can help. Here is the error I got:

IndexError Traceback (most recent call last)
in
6 "stride": 2}
7
----> 8 Z, cache_conv = conv_forward(A_prev, W, b, hparameters)
9 print("Z's mean =\n", np.mean(Z))
10 print("Z[0,2,1] =\n", Z[0, 2, 1])

in conv_forward(A_prev, W, b, hparameters)
80 weights = W[:,:,:,c]
81 biases = b[0,0,0,c]
---> 82 Z[i, h, w, c] = conv_single_step(a_slice_prev, weights, biases)
83 # YOUR CODE ENDS HERE
84

in conv_single_step(a_slice_prev, W, b)
25 s = a_slice_prev * W
26 Z = np.sum(s, axis=None)
---> 27 Z = Z + float(b[0])
28
29 # YOUR CODE ENDS HERE

IndexError: invalid index to scalar variable.

2 Likes

The error message seems pretty clear: it is telling you that b is a scalar, so it doesn’t make sense to try to index it by writing b[0].
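A tiny standalone reproduction of the error (the value 1.5 is made up; the point is that a NumPy scalar cannot be indexed):

```python
import numpy as np

# b here is a NumPy scalar, e.g. what b[0, 0, 0, c] gives you inside the loop
b = np.float64(1.5)

try:
    _ = b[0]              # this is the bug: indexing a scalar raises IndexError
except IndexError as e:
    print(e)

z = 2.0 + float(b)        # cast instead of indexing
print(z)                  # 3.5
```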

2 Likes

This response was super helpful for me! I was stuck on this question for ages.

1 Like

Hi, quick question: how come in the second for loop the stride is multiplied by the loop index rather than added to the height?

1 Like

You could do it either way: each time around the loop, the height position in the input space increases by the stride value. So you can either accumulate it in a separate variable, adding each time around the loop, or multiply each time. The multiply method saves you from having to create another variable to track the sum.

Note that you have to be careful not to include the stride in the for loop “index” expression, because you are looping over the output space, right? The striding happens in the input space.
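As a toy 1-D illustration of that point (made-up numbers, not the assignment code): loop over output positions without a stride step, and apply the stride only when computing the input-space window.

```python
# Toy 1-D sketch: input height 7, filter 3, stride 2, no padding.
n_H_prev, f, stride = 7, 3, 2
n_H = (n_H_prev - f) // stride + 1    # output size: 3

windows = []
for h in range(n_H):                  # h indexes the OUTPUT space: 0, 1, 2
    vert_start = h * stride           # stride applied in the INPUT space
    vert_end = vert_start + f
    windows.append((vert_start, vert_end))

print(windows)                        # [(0, 3), (2, 5), (4, 7)]
```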

2 Likes

I’ve been stuck on this one for days. I’ve printed out the corners of my boxes, the shapes of everything, intermediate results and I’m still getting an error.
Z’s mean =
0.03594490957842903
Z[0,2,1] =
[-2.17796037 8.07171329 -0.5772704 3.36286738 4.48113645 -2.89198428
10.99288867 3.03171932]
cache_conv[0][1][2][3] =
[-1.1191154 1.9560789 -0.3264995 -1.34267579]
First Test: Z’s mean is incorrect. Expected: 0.5511276474566768
Your output: 0.03594490957842903

Any clues would be appreciated.

1 Like

There are lots of potential errors in this code. If you’ve read through this thread and checked your handling of the stride values (applied in the input space, not the output space), then maybe it’s time to look at your code. Please check your DMs …

1 Like

I am trying to follow @Bakir’s logic description and have come up with the following running log:

m:2, n_H:3, n_W:4, n_C:8
b: [[[[-1.39881282  0.08176782 -0.45994283  0.64435367  0.37167029
     1.85300949  0.14225137  0.51350548]]]]
A_prev.shape: (2, 5, 7, 4) a_prev_pad: (5, 7, 4)
vstart: 1 vend:4
hstart: 1 hend: 4
a_slice_prev.shape: (3, 3, 4) weights: (3, 3, 4) bias:-1.3988128186664763
a_slice_prev.shape: (3, 3, 4), W.shape: (3, 3, 4)
a_slice_prev.shape: (3, 3, 4) weights: (3, 3, 4) bias:0.08176781880561644
a_slice_prev.shape: (3, 3, 4), W.shape: (3, 3, 4)
a_slice_prev.shape: (3, 3, 4) weights: (3, 3, 4) bias:-0.45994283084068716
a_slice_prev.shape: (3, 3, 4), W.shape: (3, 3, 4)
a_slice_prev.shape: (3, 3, 4) weights: (3, 3, 4) bias:0.6443536660303223
a_slice_prev.shape: (3, 3, 4), W.shape: (3, 3, 4)
a_slice_prev.shape: (3, 3, 4) weights: (3, 3, 4) bias:0.37167029121186534
a_slice_prev.shape: (3, 3, 4), W.shape: (3, 3, 4)
a_slice_prev.shape: (3, 3, 4) weights: (3, 3, 4) bias:1.853009485069379
a_slice_prev.shape: (3, 3, 4), W.shape: (3, 3, 4)
a_slice_prev.shape: (3, 3, 4) weights: (3, 3, 4) bias:0.14225137252631778
a_slice_prev.shape: (3, 3, 4), W.shape: (3, 3, 4)
a_slice_prev.shape: (3, 3, 4) weights: (3, 3, 4) bias:0.5135054799885475
a_slice_prev.shape: (3, 3, 4), W.shape: (3, 3, 4)
hstart: 1 hend: 4
a_slice_prev.shape: (3, 3, 4) weights: (3, 3, 4) bias:-1.3988128186664763
a_slice_prev.shape: (3, 3, 4), W.shape: (3, 3, 4)
a_slice_prev.shape: (3, 3, 4) weights: (3, 3, 4) bias:0.08176781880561644
a_slice_prev.shape: (3, 3, 4), W.shape: (3, 3, 4)
a_slice_prev.shape: (3, 3, 4) weights: (3, 3, 4) bias:-0.45994283084068716
a_slice_prev.shape: (3, 3, 4), W.shape: (3, 3, 4)
a_slice_prev.shape: (3, 3, 4) weights: (3, 3, 4) bias:0.6443536660303223
a_slice_prev.shape: (3, 3, 4), W.shape: (3, 3, 4)
a_slice_prev.shape: (3, 3, 4) weights: (3, 3, 4) bias:0.37167029121186534
a_slice_prev.shape: (3, 3, 4), W.shape: (3, 3, 4)
a_slice_prev.shape: (3, 3, 4) weights: (3, 3, 4) bias:1.853009485069379
a_slice_prev.shape: (3, 3, 4), W.shape: (3, 3, 4)
a_slice_prev.shape: (3, 3, 4) weights: (3, 3, 4) bias:0.14225137252631778
a_slice_prev.shape: (3, 3, 4), W.shape: (3, 3, 4)
a_slice_prev.shape: (3, 3, 4) weights: (3, 3, 4) bias:0.5135054799885475
a_slice_prev.shape: (3, 3, 4), W.shape: (3, 3, 4)
hstart: 3 hend: 6
a_slice_prev.shape: (3, 3, 4) weights: (3, 3, 4) bias:-1.3988128186664763
a_slice_prev.shape: (3, 3, 4), W.shape: (3, 3, 4)
a_slice_prev.shape: (3, 3, 4) weights: (3, 3, 4) bias:0.08176781880561644
a_slice_prev.shape: (3, 3, 4), W.shape: (3, 3, 4)
a_slice_prev.shape: (3, 3, 4) weights: (3, 3, 4) bias:-0.45994283084068716
a_slice_prev.shape: (3, 3, 4), W.shape: (3, 3, 4)
a_slice_prev.shape: (3, 3, 4) weights: (3, 3, 4) bias:0.6443536660303223
a_slice_prev.shape: (3, 3, 4), W.shape: (3, 3, 4)
a_slice_prev.shape: (3, 3, 4) weights: (3, 3, 4) bias:0.37167029121186534
a_slice_prev.shape: (3, 3, 4), W.shape: (3, 3, 4)
a_slice_prev.shape: (3, 3, 4) weights: (3, 3, 4) bias:1.853009485069379
a_slice_prev.shape: (3, 3, 4), W.shape: (3, 3, 4)
a_slice_prev.shape: (3, 3, 4) weights: (3, 3, 4) bias:0.14225137252631778
a_slice_prev.shape: (3, 3, 4), W.shape: (3, 3, 4)
a_slice_prev.shape: (3, 3, 4) weights: (3, 3, 4) bias:0.5135054799885475
a_slice_prev.shape: (3, 3, 4), W.shape: (3, 3, 4)
hstart: 5 hend: 8
a_slice_prev.shape: (3, 2, 4) weights: (3, 3, 4) bias:-1.3988128186664763
a_slice_prev.shape: (3, 2, 4), W.shape: (3, 3, 4)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-51-7e580406a9e8> in <module>
      6                "stride": 2}
      7 
----> 8 Z, cache_conv = conv_forward(A_prev, W, b, hparameters)
      9 z_mean = np.mean(Z)
     10 z_0_2_1 = Z[0, 2, 1]

<ipython-input-50-342532606697> in conv_forward(A_prev, W, b, hparameters)
     86                     bias = b[0, 0, 0, c]
     87                     print(f'a_slice_prev.shape: {a_slice_prev.shape} weights: {weights.shape} bias:{bias}')
---> 88                     Z[i, h, w, c] = conv_single_step(a_slice_prev, weights, bias)
     89                     pass
     90     # YOUR CODE ENDS HERE

<ipython-input-40-ae92196f3681> in conv_single_step(a_slice_prev, W, b)
     24     # YOUR CODE STARTS HERE
     25     print(f'a_slice_prev.shape: {a_slice_prev.shape}, W.shape: {W.shape}')
---> 26     s = a_slice_prev * W
     27     Z = np.sum(s)
     28     Z += np.asscalar(b)

ValueError: operands could not be broadcast together with shapes (3,2,4) (3,3,4)

When hstart=5 and hend=8, the matrix shapes don’t match: (3,2,4) & (3,3,4).

Based on the n_H and n_W formulas, I got the following formula to calculate them:

# moderator edit; code removed

Is this in the right direction? Another thing that might be causing the issue is that I didn’t make use of this reminder:

Use array slicing (e.g.`varname[0:1,:,3:5]` ) for the following variables:
`a_prev_pad` ,`W` , `b`

Instead of using slicing as above, I got a_prev_pad by indexing into A_prev with ‘i’, the training example index.
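For reference, a small sketch of the window arithmetic using the width from the log above (n_W_prev = 7, f = 3, stride = 2; pad = 1 is an assumption I am inferring). The key point is that the windows must be sliced from the padded input, which is why slicing the unpadded A_prev produces a short (3, 2, 4) slice at the edge:

```python
# Widths from the log above; pad = 1 is an assumed value for illustration.
n_W_prev, f, stride, pad = 7, 3, 2, 1
n_W = (n_W_prev - f + 2 * pad) // stride + 1   # 4 output columns
padded_width = n_W_prev + 2 * pad              # 9

for w in range(n_W):
    horiz_start = w * stride                   # 0, 2, 4, 6
    horiz_end = horiz_start + f                # 3, 5, 7, 9
    # every window fits inside the PADDED input (width 9); slicing the
    # unpadded input (width 7) instead is what truncates the last window
    assert horiz_end <= padded_width
```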

1 Like