Week 1, Assignment 1: Calculating Weights and Biases

In programming assignment 1 for week 1, in the function conv_forward(), I am stuck on the final step of calculating the weights, bias, and Z.

Could you please guide me in the right direction on how to do that?

Thank You!!


The way to think about this is that the innermost loop is over the output channels, right? There is a separate set of weights and bias for each output channel. So you need to index those arrays to select the appropriate index of the last (output channel) dimension. In the case of Z, you are just assigning to the appropriate set of indices for that as well. The indices are given by the current state of all the “loop control” variables, right?
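As a rough sketch of that idea (the shapes here come from the test case discussed later in this thread, and the variable names `weights`, `biases`, and `a_slice_prev` follow the notebook's conventions, but the exact loop body is yours to write):

```python
import numpy as np

# Illustrative shapes: W is (f, f, n_C_prev, n_C), b is (1, 1, 1, n_C)
f, n_C_prev, n_C = 3, 4, 8
W = np.random.randn(f, f, n_C_prev, n_C)
b = np.random.randn(1, 1, 1, n_C)
a_slice_prev = np.random.randn(f, f, n_C_prev)  # one (f, f, n_C_prev) input window

Z_values = np.zeros(n_C)
for c in range(n_C):                  # innermost loop: output channels
    weights = W[:, :, :, c]           # the (f, f, n_C_prev) filter for channel c
    biases = b[:, :, :, c]            # the single bias value for channel c
    Z_values[c] = np.sum(a_slice_prev * weights) + biases.item()
```

The key point is that indexing the last dimension with `c` drops that dimension, leaving a 3-D filter that matches the shape of the input slice.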

I’m also having trouble. In particular, I don’t understand how to “slice the A_prev_pad array” to pull out a filter-shaped block (3x3x4 instead of, say, 3x9x4). I’m not understanding how “a_slice_prev” is used in the slicing. Is there any documentation or a hint on how to extract the right slice?

I added some print statements to my conv_forward function to show the shapes. Here’s what I get for the standard test case in the notebook:

New dimensions = 3 by 4
Shape Z = (2, 3, 4, 8)
Shape A_prev_pad = (2, 7, 9, 4)
a_prev_pad shape = (7, 9, 4)
Shape a_slice_prev = (3, 3, 4)

Remember that the first dimension is the “samples” dimension. The outer loop is over the samples. So each sample input is (7,9,4). Then you step through that with the filter size and stride size. The other thing to keep in mind is that the three inner loops are all over the output volume, right? So for each value of h in the output, your job in the loop is to compute where that maps to in the input volume. You compute vert_start and vert_end. Similarly for w you compute horiz_start and horiz_end.
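A minimal sketch of that index mapping (the shapes and stride here are inferred from the test-case shapes printed above; variable names follow the notebook, but the code is only illustrative, not the official solution):

```python
import numpy as np

stride, f = 2, 3
a_prev_pad = np.random.randn(7, 9, 4)   # one padded sample: (n_H_pad, n_W_pad, n_C_prev)
n_H, n_W = 3, 4                          # output spatial dimensions from the test case

for h in range(n_H):                     # loop over the OUTPUT height
    vert_start = h * stride             # map output row h back to the input
    vert_end = vert_start + f
    for w in range(n_W):                 # loop over the OUTPUT width
        horiz_start = w * stride
        horiz_end = horiz_start + f
        # filter-shaped window: (f, f, n_C_prev) = (3, 3, 4)
        a_slice_prev = a_prev_pad[vert_start:vert_end, horiz_start:horiz_end, :]
        assert a_slice_prev.shape == (3, 3, 4)
```

Note that the corners are the loop variable multiplied by the stride; the window always spans exactly `f` rows and `f` columns and all input channels.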

While I'm not 100% confident, I think I have the loops and various variables correct (if not, I can fix them). I agree with those shapes, except: how is a_slice_prev (3, 3, 4)? When calculating Z at the end, I’m having trouble calling conv_single_step with the right slice of A_prev_pad:

Z[i, h, w, c] = conv_single_step(A_prev_pad[a_slice_prev**???**], weights, biases)

Paul, thank you for your help. The “Shape a_slice_prev = (3, 3, 4)” hint helped. I was assuming that the variable held the starting coordinates (vert_start, horiz_start, “channel”). I didn’t realize it was the slice pulled out of A_prev_pad. It’s now mostly working.


I went through a lot of posts and believe I have the correct code, but I am getting the error below.

Z's mean =
Z[0,2,1] =
[-2.32126108 0.91040602 2.31852532 0.98842271 3.31716611 4.05638832
-2.48135123 0.95872443]
cache_conv[0][1][2][3] =
[-1.1191154 1.9560789 -0.3264995 -1.34267579]

ValueError Traceback (most recent call last)
     11 print("cache_conv[0][1][2][3] =\n", cache_conv[0][1][2][3])
---> 13 conv_forward_test(conv_forward)

~/work/release/W1A1/public_tests.py in conv_forward_test(target)
     79 W = np.random.randn(5, 5, 4, 8)
     80 b = np.random.randn(1, 1, 1, 8)
---> 81 Z, cache_conv = target(A_prev, W, b, {"pad" : 6, "stride": 1})
     82 Z_shape = Z.shape
     83 print(Z_shape)

in conv_forward(A_prev, W, b, hparameters)
     72 weights = W[:,:,:,c]
     73 biases = b[:,:,:,c]
---> 74 Z[i, h, w, c] = conv_single_step(a_slice_prev, weights, biases)

in conv_single_step(a_slice_prev, W, b)
     26 # Element-wise product between a_slice_prev and W. Do not add the bias yet.
---> 27 s = a_slice_prev * W
     28 # Sum over all entries of the volume s.
     29 Z = np.sum(s)

ValueError: operands could not be broadcast together with shapes (5,4,4) (5,5,4)

I think it is related to this line in conv_single_step:

    # Element-wise product between a_slice_prev and W. Do not add the bias yet.
    s = a_slice_prev * W

but I'm not sure how to fix it.

Also, I am getting the error below for pool_forward:

ValueError Traceback (most recent call last)
      4 hparameters = {"stride" : 1, "f": 3}
----> 6 A, cache = pool_forward(A_prev, hparameters, mode = "max")
      7 print("mode = max")
      8 print("A.shape = " + str(A.shape))

in pool_forward(A_prev, hparameters, mode)
     51 if mode == "max":
---> 53 A[i, h, w, c] = np.max(a_prev_slice)
     55 elif mode == "average":

<__array_function__ internals> in amax(*args, **kwargs)

/opt/conda/lib/python3.7/site-packages/numpy/core/fromnumeric.py in amax(a, axis, out, keepdims, initial, where)
   2666 """
   2667 return _wrapreduction(a, np.maximum, 'max', axis, None, out,
-> 2668 keepdims=keepdims, initial=initial, where=where)

/opt/conda/lib/python3.7/site-packages/numpy/core/fromnumeric.py in _wrapreduction(obj, ufunc, method, axis, dtype, out, **kwargs)
     88 return reduction(axis=axis, out=out, **passkwargs)
---> 90 return ufunc.reduce(obj, axis, dtype, out, **passkwargs)

ValueError: zero-size array to reduction operation maximum which has no identity

I think it is related to this step (“Use the corners to define the current slice on the ith training example of A_prev, channel c. (≈1 line)”):

    a_prev_slice = A_prev[vert_start:vert_end,horiz_start:horiz_end]

It looks like an error in the slicing. Please help!

The first thing to realize is that it is a general principle of debugging that just because the error is thrown in conv_single_step does not mean that’s where the bug is. Most likely it means you are passing incorrect arguments from conv_forward to conv_single_step. Please see my earlier posts for the dimensions that you should be seeing for everything. At that stage both a_slice_prev and W should be 3 x 3 x 4. So why did they not turn out that way?
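One cheap way to localize this kind of bug is to check shapes at the call site rather than letting the error surface deep inside conv_single_step. A hypothetical guard, using the shapes from the traceback above:

```python
import numpy as np

# Stand-ins for the values the traceback reported at the point of failure
a_slice_prev = np.random.randn(5, 4, 4)   # the buggy slice shape
weights = np.random.randn(5, 5, 4)        # the expected (f, f, n_C_prev) filter

try:
    # A shape check placed in conv_forward, just before calling conv_single_step,
    # fails with a readable message instead of a broadcasting ValueError later.
    assert a_slice_prev.shape == weights.shape, (
        f"slice {a_slice_prev.shape} does not match filter {weights.shape}"
    )
except AssertionError as err:
    print(err)
```

A mismatch like (5, 4, 4) vs (5, 5, 4) points straight at the slicing indices in conv_forward, which is where the actual bug lives.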

@sonali: for the pool_forward case, remember that A_prev is a 4 dimensional array. You are looping over the samples (the first dimension) as the “outer” loop and the “channel” dimension (the last dimension) as the “innermost” loop. But you need to specify all 4 index values for A_prev at that point. You are only specifying 2. That is probably why you’re seeing that error get thrown.
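To illustrate the difference (the shapes and loop-variable values below are made up for the demo; only the indexing pattern matters):

```python
import numpy as np

# In pool_forward, A_prev is 4-D: (m, n_H_prev, n_W_prev, n_C)
m, n_H_prev, n_W_prev, n_C = 2, 5, 5, 3
A_prev = np.random.randn(m, n_H_prev, n_W_prev, n_C)
f, stride = 3, 1
i, h, w, c = 0, 1, 1, 2                      # example loop-variable values

vert_start, vert_end = h * stride, h * stride + f
horiz_start, horiz_end = w * stride, w * stride + f

# Only 2 indices -> wrong: the slices land on the (samples, height) axes
wrong_slice = A_prev[vert_start:vert_end, horiz_start:horiz_end]

# All 4 indices -> a proper (f, f) window on sample i, channel c
a_prev_slice = A_prev[i, vert_start:vert_end, horiz_start:horiz_end, c]
```

With only two indices, `vert_start:vert_end` slices the samples axis (size 2 here), which is how you end up with empty, zero-size slices once the range falls outside it.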

Hi, I have been stuck on this assignment for over a week as well, and I can’t tell what I am doing wrong. Any tips or explanation of what I am missing would be really helpful; I am unable to proceed with the course because of this. I get the error [Error: Wrong output for variable in position 0.].
It’s resolved now: I was meant to multiply the corners by the stride, not add it.

Here, a_slice_prev is a 3D matrix. However, the weights and biases are 4D: each is a set of 3D matrices. You need to slice out one 3D matrix on each iteration.


Hi Paul,
Good afternoon!
I am struggling with the dimensions of W and a_slice_prev in the innermost loop.

I printed the below:

(3, 3, 4)

But W has the following shape:

(3, 3, 4, 8)

I am iterating the inner loop with c from 0 to n_C. I know I have to extract the weights from W using this c index, but I have found no way to do it so far.

Any tips?


I have the same confusion as cserranobr. We have only 4 loops, from top down:

  1. samples: i in range(m)
  2. height: h in range(a_prev_pad.shape[0])
  3. width: w in range(a_prev_pad.shape[1])
  4. channels: c in range(a_prev_pad.shape[2])

Up to this stage, we slice a_prev_pad into a block of shape (f, f, c), but the weights have a 4th dimension: the number of filters. Each filter should multiply the slice of a_prev_pad (plus b) to get Z with shape (f, f, c, n_C). However, in the code there is no 5th loop.

What did I miss here? Can someone help?


W and b are 4D matrices; you can see W has shape (f, f, c, n_C). The solution is to use the loops to slice the 4D matrix into 3D matrices.


Thanks TeSyuq!
Does that mean I should add one more loop to slice the weights and biases, such as:

    for x in range(n_C):
        weights = W[:, :, :, x]
        biases = b[:, :, :, x]

and then proceed to conv_single_step?

You don’t need to add another loop, because the existing 4th loop (over c) already solves this problem.


Thanks a lot! I figured it out!

I made two mistakes: 1) conv_single_step is an element-wise product, so there is no need for an extra loop over the channels; 2) h and w should range over the output shape, not the shape of a_prev_pad, and the corner indices must account for the stride.

Thanks for the help!


Hi Leon,

I am still confused. How do we extract a 3D matrix from a 4D matrix?

I created a simple snippet to try it in Python; W has shape:

(3, 3, 4, 8)

How do we iterate over the last index (it should be c in our inner loop, right)?

I tried the below, but that is not right (we should extract a shape of (3, 3, 4) for W):

(3, 4, 8)


Hi All,

I think I finally got it by using the construct [:,:,:,index].
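A quick demo of why that construct works, and why plain `W[0]` does not (the shape here matches the (3, 3, 4, 8) example above; the snippet is just an illustration):

```python
import numpy as np

W = np.random.randn(3, 3, 4, 8)   # (f, f, n_C_prev, n_C)

# Indexing with a bare integer hits the FIRST axis, which is not what we want:
print(W[0].shape)           # (3, 4, 8)

# [:, :, :, index] keeps the first three axes and drops the last one,
# yielding a single 3x3x4 filter for output channel `index`:
print(W[:, :, :, 0].shape)  # (3, 3, 4)
```

This is exactly the shape conv_single_step expects for its weights argument.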