Programming Assignment 1 - Week 1 - Exercise 6 - rnn_backward

Hi, can somebody explain to me what I am doing wrong here? Thank you!

It looks like your a_next value has shape (5, 10), but the da_next value has shape (5, 10, 4). The gradient should be the same shape as the base object, so the next question is: which one is the correct shape? And then, how did the wrong one end up with the wrong shape?
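As a side note, a single shape check inside rnn_cell_backward is enough to surface this kind of mismatch early. A minimal sketch, using placeholder arrays with the shapes reported in this thread (in the notebook the real values come from the forward pass and the test cell):

import numpy as np

# Placeholder arrays with the shapes from this thread: (n_a, m) = (5, 10), T_x = 4.
a_next = np.random.randn(5, 10)       # hidden state from the forward pass
da_next = np.random.randn(5, 10, 4)   # wrong: the full 3D gradient instead of one time-step slice

# A gradient always has the same shape as the quantity it differentiates,
# so this check fails immediately and reports the offending shapes.
assert da_next.shape == a_next.shape, (da_next.shape, a_next.shape)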

I added some print statements to my rnn_cell_backward function to show the shapes and then ran the test cell for rnn_backward. Here’s what I see:

da_next.shape = (5, 10)
a_next.shape = (5, 10)
da_next.shape = (5, 10)
a_next.shape = (5, 10)
da_next.shape = (5, 10)
a_next.shape = (5, 10)
da_next.shape = (5, 10)
a_next.shape = (5, 10)
gradients["dx"][1][2] = [-2.07101689 -0.59255627  0.02466855  0.01483317]
gradients["dx"].shape = (3, 10, 4)
gradients["da0"][2][3] = -0.31494237512664996
gradients["da0"].shape = (5, 10)
gradients["dWax"][3][1] = 11.264104496527777
gradients["dWax"].shape = (5, 3)
gradients["dWaa"][1][2] = 2.303333126579893
gradients["dWaa"].shape = (5, 5)
gradients["dba"][4] = [-0.74747722]
gradients["dba"].shape = (5, 1)

So why does your da_next end up having 3 dimensions instead of 2?


Hi @paulinpaloalto, I just printed the shapes of my a_next and da_next and they both have the same shape, i.e., (5, 10). I have also passed the exercises that calculated these values, so how could I get the shape (5, 10, 4) for da_next?

Hi @Syed_Hamza_Tehseen!

Check how you call rnn_cell_backward inside rnn_backward, given this comment from the notebook:

# Compute gradients at time step t. Choose wisely the "da_next" and the "cache" to use in the backward propagation step. (≈1 line)

Hint: da is 3D, with shape (5, 10, 4), but you have to pass only a 2D slice of it: all the values from the first and second dimensions, but only a single value from the third. It just takes a little Python to do this.
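For illustration, here is a minimal NumPy sketch of that kind of slicing, using a placeholder array with the 3D shape from the test output above:

import numpy as np

# Placeholder with shape (n_a, m, T_x) = (5, 10, 4), as reported in the thread.
da = np.random.randn(5, 10, 4)

t = 2                     # any single time step
da_slice = da[:, :, t]    # all of the first two dimensions, one index from the third

print(da_slice.shape)     # (5, 10) -- the 2D shape the cell-level backward step expects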

Best,
Saif.

Hi @saifkhanengr, I understand your point and I have implemented your suggested method, but now I'm getting an IndexError, even though I initialized da_prevt with the same shape as da, i.e., (n_a, m, T_x). I don't understand why it's throwing the error "too many indices for array".

Your implementation is still incorrect. You only need to take the 2D slice from the variable that is 3D; the one that is already 2D can be passed as it is.

Hint:

da_prevt shape: (5, 10)
da shape: (5, 10, 4)

Now think about it: from which one do you need to take a 2D slice, and which one can go in as it is?

Yes, it's working now. Thank you! Can you tell me why da is a 3D array while da_prevt is a 2D array? Why don't they have the same shape?

The answer is in the Notebook:

da -- Upstream gradients of all hidden states, of shape (n_a, m, T_x)
da_prev -- Gradients of previous hidden state, of shape (n_a, m)
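To make that concrete, here is a minimal sketch of the loop structure involved, with a placeholder standing in for the real rnn_cell_backward (the variable names follow the notebook; everything else is illustrative):

import numpy as np

# Shapes from this thread: n_a = 5 hidden units, m = 10 examples, T_x = 4 time steps.
n_a, m, T_x = 5, 10, 4

da = np.random.randn(n_a, m, T_x)   # one upstream gradient slice per hidden state a<t>
da_prevt = np.zeros((n_a, m))       # gradient carried back through the recurrence; nothing yet at t = T_x - 1

for t in reversed(range(T_x)):
    # The total gradient reaching a<t> is the slice for this time step plus
    # whatever flowed back from step t + 1. Both pieces are (n_a, m), so they add.
    da_t = da[:, :, t] + da_prevt

    # In the assignment, rnn_cell_backward would return the new da_prevt (among
    # other gradients); this placeholder only mimics its shape.
    da_prevt = np.random.randn(n_a, m)

So da is 3D because it stacks one (n_a, m) gradient per time step, while da_prevt is the single (n_a, m) gradient being carried backward from one step to the next.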