DLS course 5 week 1 RNN forward

I have the following error:

ValueError Traceback (most recent call last)
9 parameters_tmp['by'] = np.random.randn(2, 1)
---> 11 a_tmp, y_pred_tmp, caches_tmp = rnn_forward(x_tmp, a0_tmp, parameters_tmp)
12 print("a[4][1] = \n", a_tmp[4][1])
13 print("a.shape = \n", a_tmp.shape)

in rnn_forward(x, a0, parameters)
43 a_next, yt_pred, cache = rnn_cell_forward(x[:,:,t], a_next, parameters)
44 # Save the value of the new "next" hidden state in a (≈1 line)
---> 45 a[:,:,t] = a_next
46 # Save the value of the prediction in y (≈1 line)
47 y_pred[:,:,t] = yt_pred

ValueError: could not broadcast input array from shape (5,10) into shape (3,10)

I am wondering if it has to do with how I am initializing a and y_pred with zeros. I’m using the np.zeros function with x for both of them, since x has dimensions (n_x, m, T_x) and a and y_pred have dimensions (n_a, m, T_x) and (n_y, m, T_x) respectively. I know that n_x and n_y are the same, but maybe n_a is one off from n_x because of the initial activation? That’s just a thought I had as a possibility.
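For anyone reading along, the error can be reproduced outside the notebook. This is only a sketch: it assumes the shapes shown in the traceback (n_x = 3, n_a = 5, m = 10) plus a made-up T_x = 4, and it stands in a random array for whatever rnn_cell_forward returns. It suggests that an array of zeros shaped like x has n_x rows, so it has no room for a hidden state with n_a rows.

```python
import numpy as np

# Shapes taken from the traceback; T_x = 4 is an assumption for illustration.
n_x, n_a, m, T_x = 3, 5, 10, 4

x = np.random.randn(n_x, m, T_x)
a_next = np.random.randn(n_a, m)   # stand-in for one step's hidden state

a_wrong = np.zeros(x.shape)        # shape (3, 10, 4): first axis is n_x, not n_a

try:
    a_wrong[:, :, 0] = a_next      # tries to put (5, 10) into a (3, 10) slot
    error_message = None
except ValueError as err:
    error_message = str(err)

print(error_message)
```

So the mismatch is not an off-by-one from the initial activation; the first axis of a is simply a different quantity from the first axis of x.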

No, ‘with x’ is not correct.
The instructions tell you what to use:

Thank you. I understand that these are the directions, but at the end of the day I’m just creating an array of zeros of the correct shape. So couldn’t I use any item of the same shape to create a zero array? Your point may very well be my problem, in the sense that I’m using x and x might not be the right shape, although close. So my two questions are: first, couldn’t you get the right-sized array of zeros by any means that arrives at that shape? And second, is there a discrepancy between the shape of x and the shape of a like I was hinting at, such as a being one larger than x because of the activation-zero part of a? Thanks.

Shapes and zeros have been haunting me throughout this whole specialization. I feel like I understand them, but then a new problem hits me. If Paul were here he’d be getting on my case about using them wrong here again, haha. He’s right, but still. I tried it this way:

a = np.zeros(np.shape(n_a), np.shape(x[1]), np.shape(x[2]))

but was again stopped

n_a has literally nothing to do with n_x, right? a is the “hidden state” and x is the input.

If this is not clear to you, then it might be worth watching the lectures again.

I’m using x here because x is (nx, m, and tx) and I need m and tx for a so I just indexed them. I have na already so I can use that directly.

Sorry, our messages crossed “in midair”. But what does np.shape(n_a) have to do with the value of n_a? Even if n_a were not a scalar, there would still be no relationship between shape and value, right?

Also note that np.shape(x[1]) is the shape of the slice x[1], not the second dimension of x, right? Maybe you meant np.shape(x)[1]. That’s not the same thing.
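A small sketch of the distinction, assuming a hypothetical x of shape (3, 10, 4) and n_a = 5 as a plain Python integer. It is only meant to show what each expression evaluates to, not how the assignment should be written.

```python
import numpy as np

x = np.zeros((3, 10, 4))   # hypothetical (n_x, m, T_x)
n_a = 5                    # a plain scalar, as in the thread

scalar_shape = np.shape(n_a)    # empty tuple: says nothing about the value 5
slice_shape = np.shape(x[1])    # shape of the slice x[1], i.e. (m, T_x)
m = np.shape(x)[1]              # second entry of x's shape, i.e. m

print(scalar_shape)  # ()
print(slice_shape)   # (10, 4)
print(m)             # 10
```

Where the index sits relative to the closing parenthesis decides whether you are indexing the data or indexing the shape tuple.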

n_a is the number of units in the vector. It says to initialize the ‘a’ vector with n_a units as index ‘0’, so I was doing that. I can’t use ‘a’ yet because that is what I’m initializing.

To comment on “nothing to do with nx”: I understand this, and maybe it’s bad programming. But if nx were equivalent in shape to na and I had access to nx, then couldn’t I still GET what I wanted by using the shape of nx as na (again, I realize na and nx may be different sizes)? It’s kind of like if tx and ty are equal, I could use their shapes interchangeably. Or I could just hardcode the numbers.

I know x is the input, but since that is a given value in this scope I was trying to use the items that I have access to within the function. I guess.

If n_a, as a scalar number of hidden units, was 100, then yes, I guess it wouldn’t make sense to zero the scalar. By using n_a.shape I was trying to take the, say, (100, 1) array shape that n_a is and zero that.

A scalar doesn’t have a shape, right? Only a numpy array has a shape.

Note that you are given x and a0 as parameters to this function. You can use a0 to find the number of elements in a, right? Have you tried printing what you get from a0.shape?

a0 is an array of shape (5, 10), i.e. (n_a, m), so I should be able to index a0 for n_a, which I need for a.

Since I want ‘a’, which is (n_a, m, T_x), and a0 is (n_a, m) and x[2] is T_x, I should be able to use a = a0[0], a0[1], x[2] to get the right shapes…

I think your basic idea is correct, but you’re confusing the values with the shapes again. What is a0[0]? Try printing it out and see what you get.

It’s late in my time zone (UTC +2 for the moment), so I am signing off until tomorrow.

a0[0] is a 1x10 array

Sounds good. Thank you!

@zac_builta, this is a reminder that the hint tells you exactly what variable names to use.

Indexing the variables is a mistake. That will give you the contents of an element of the variable. What you want is the size of the entire variable.
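To make the contents-versus-size point concrete, here is a minimal sketch assuming a0 has the (5, 10) shape reported earlier in the thread. The variable names first_row, n_a and m are just illustrative.

```python
import numpy as np

a0 = np.random.randn(5, 10)   # (n_a, m), as reported in the thread

first_row = a0[0]             # indexing: returns the 10 values stored in row 0
n_a, m = a0.shape             # shape: returns the sizes of the two axes

print(first_row.shape)        # (10,)
print(n_a, m)                 # 5 10
```

Indexing hands back data, while .shape hands back a tuple of integers that can be unpacked into sizes.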

Right, because a0 is a 2-dimensional numpy array and you’ve “sliced” it on the first dimension. So now try this:

print(f"a0.shape = {a0.shape}")

What does that give you?

This is just basic numpy stuff. I’ve commented at least a couple of times above that the shape of an object and the values of an object are not the same.
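As a rough sketch of where this leads, assuming the test-case shapes from the traceback and a made-up T_x = 4: the sizes come from unpacking x.shape and a0.shape, and np.zeros takes them as a single tuple. The earlier attempt passed three separate arguments, which np.zeros would read as shape, dtype and order rather than as three dimensions. This is an illustration of the shape handling only, not the graded solution.

```python
import numpy as np

x = np.random.randn(3, 10, 4)   # hypothetical (n_x, m, T_x)
a0 = np.random.randn(5, 10)     # (n_a, m)

n_x, m, T_x = x.shape           # sizes of the input
n_a, _ = a0.shape               # number of hidden units comes from a0, not x

a = np.zeros((n_a, m, T_x))     # one tuple argument holding all three sizes
a[:, :, 0] = a0                 # a (5, 10) array now fits a (5, 10) slot

print(f"a.shape = {a.shape}")   # a.shape = (5, 10, 4)
```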