Week 1 Assignment 1 RNN_backward Exercise 6: wrong output data

Hello, I'm trying to complete def rnn_backward(da, caches) in Exercise 6 - rnn_backward.
My results in Exercise 5 are correct with the test data, but in Exercise 6 they are not. The shapes are okay, but the values are not. I checked the posts here and saw I'm not the only one, but I can't figure out the problem.
I don't want to paste my code here.

my output is:
gradients["dx"][1][2] = [0.04036334 0.01590669 0.00395097 0.01483317]
gradients["dx"].shape = (3, 10, 4)
gradients["da0"][2][3] = -0.0007053016291385033
gradients["da0"].shape = (5, 10)
gradients["dWax"][3][1] = 8.452426371294356
gradients["dWax"].shape = (5, 3)
gradients["dWaa"][1][2] = 1.2707651799408062
gradients["dWaa"].shape = (5, 5)
gradients["dba"][4] = [-0.50815277]
gradients["dba"].shape = (5, 1)

thanks

Please click my name and message your notebook as an attachment.

Please fix your call to rnn_cell_backward to include the gradient at time t and the incoming gradient. In your implementation, only the incoming gradient is used.
Another hint: Look at da.

Thanks, solved. I used the original da at time t but didn't add the new da.


I am also getting no error, but my output is different from the Expected Output. Can I send you my code? Below is my output.

gradients["dx"][1][2] = [-0.15028183 -0.34554547 0.02071758 0.01483317]
gradients["dx"].shape = (3, 10, 4)
gradients["da0"][2][3] = -0.17268893183890754
gradients["da0"].shape = (5, 10)
gradients["dWax"][3][1] = 4.081485734449453
gradients["dWax"].shape = (5, 3)
gradients["dWaa"][1][2] = 1.056012342849445
gradients["dWaa"].shape = (5, 5)
gradients["dba"][4] = [-0.12427391]
gradients["dba"].shape = (5, 1)

Check your call to the function rnn_cell_backward(…)

[snippet removed by mentor]

Hello everyone, first of all thanks to those of you who have posted here about this issue. It was helpful for me to fix some bugs in my rnn_cell_backward() function.

Now I have the same problem as Justin. In fact, my code produces the same gradients as his. I have already checked that the inputs I pass to rnn_cell_backward(), da and cache, use only the slice at time t, but the gradient values did not change. Also, my rnn_cell_backward() function passed all the previous tests. So I am quite lost as to how to solve this problem…

I would be very grateful for any ideas or comments!

It is stated in the notebook: "Choose wisely the da_next and the cache to use in the backward propagation step."

I guess the problem is with how you are choosing the "da_next". It is not only da (the slice at time t). You also have to add the gradient of the loss with respect to the hidden state at time t that flows back from the later timesteps.

Please read the below text from the notebook:

  • Note that this notebook does not implement the backward path from the Loss ‘J’ backwards to ‘a’.
    • This would have included the dense layer and softmax which are a part of the forward path.
    • This is assumed to be calculated elsewhere and the result passed to rnn_backward in ‘da’.
    • You must combine this with the loss from the previous stages when calling rnn_cell_backward (see figure 7 above).

In other words, you have to add da_prevt to da (the slice at time t).
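A minimal NumPy sketch of that backward loop, assuming a plain tanh RNN cell (the helper names, parameter dict, and shapes here are illustrative, not the assignment's exact code):

```python
import numpy as np

def rnn_cell_forward(xt, a_prev, params):
    # one forward step of a plain tanh RNN cell
    a_next = np.tanh(params["Wax"] @ xt + params["Waa"] @ a_prev + params["ba"])
    return a_next, (a_next, a_prev, xt)

def rnn_cell_backward(da_next, cache, params):
    # gradients for one cell, given the TOTAL gradient da_next on a_next
    a_next, a_prev, xt = cache
    dtanh = (1 - a_next ** 2) * da_next
    return {"dxt": params["Wax"].T @ dtanh,
            "da_prev": params["Waa"].T @ dtanh,
            "dWax": dtanh @ xt.T,
            "dWaa": dtanh @ a_prev.T,
            "dba": np.sum(dtanh, axis=1, keepdims=True)}

def rnn_backward(da, caches, params):
    n_a, m, T_x = da.shape
    n_x = caches[0][2].shape[0]
    dx = np.zeros((n_x, m, T_x))
    dWax, dWaa = np.zeros_like(params["Wax"]), np.zeros_like(params["Waa"])
    dba = np.zeros_like(params["ba"])
    da_prevt = np.zeros((n_a, m))
    for t in reversed(range(T_x)):
        # key step: external gradient at time t PLUS the gradient
        # flowing back from timestep t+1
        g = rnn_cell_backward(da[:, :, t] + da_prevt, caches[t], params)
        dx[:, :, t] = g["dxt"]
        dWax += g["dWax"]   # weight gradients are summed over timesteps
        dWaa += g["dWaa"]
        dba += g["dba"]
        da_prevt = g["da_prev"]
    return {"dx": dx, "da0": da_prevt, "dWax": dWax, "dWaa": dWaa, "dba": dba}

# smoke run with the test's shapes: n_x=3, n_a=5, m=10, T_x=4
rng = np.random.default_rng(1)
n_x, n_a, m, T_x = 3, 5, 10, 4
params = {"Wax": rng.standard_normal((n_a, n_x)),
          "Waa": rng.standard_normal((n_a, n_a)),
          "ba": rng.standard_normal((n_a, 1))}
x = rng.standard_normal((n_x, m, T_x))
a_prev, caches = rng.standard_normal((n_a, m)), []
for t in range(T_x):
    a_prev, cache = rnn_cell_forward(x[:, :, t], a_prev, params)
    caches.append(cache)
gradients = rnn_backward(rng.standard_normal((n_a, m, T_x)), caches, params)
```

The values here are random, so they won't match the notebook's expected output, but the shapes do, and the line inside the loop shows where the da[:,:,t] + da_prevt combination belongs.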

Thanks a lot! resolved by using da[:,:,t] + da_prevt!


I came across the same issue; using da[:,:,t] + da_prevt fixed it for me too, but I don’t know why it’s that and not simply da[:,:,t].

The gradients are the sums of the gradients at each timestep. The action is depicted in Figure 7:


That is what is happening in that “+” sign in the green oval that I added at the right hand side of the diagram. It is the da for the current timestep plus the cumulative sum of all the da values from the later timesteps, which is da_{prev} from the point of view of those later timesteps. You can see the current “step” feeding the next da_{prev} off the left side of the diagram to the previous timestep. Of course this is “back prop”, so we are going backwards and in an RNN it’s “backwards in time”, right? Because there is just one “layer” but we repeat it over and over and feed the results forward.
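To see concretely why that "+" is needed, here is a tiny scalar toy example (illustrative, not assignment code): with loss L = sum of all a_t, the analytic gradient for the recurrent weight only matches a numerical finite-difference check when each step adds its external gradient (1.0 here) to the gradient flowing back from later timesteps.

```python
import numpy as np

def forward(waa, wax, x, a0):
    # unrolled scalar RNN: a_t = tanh(waa * a_{t-1} + wax * x_t)
    a, a_prev = [], a0
    for xt in x:
        a_prev = np.tanh(waa * a_prev + wax * xt)
        a.append(a_prev)
    return a

def loss(waa, wax, x, a0):
    # toy loss: just the sum of all hidden states, so dL/da_t = 1 directly
    return sum(forward(waa, wax, x, a0))

x, waa, wax, a0 = [0.5, -0.3, 0.8], 0.7, 1.2, 0.1
a = forward(waa, wax, x, a0)

# backward pass, going backwards in time
dwaa, da_prevt = 0.0, 0.0
for t in reversed(range(len(x))):
    da_next = 1.0 + da_prevt           # external grad at t PLUS grad from later steps
    a_prev = a[t - 1] if t > 0 else a0
    dtanh = (1 - a[t] ** 2) * da_next
    dwaa += dtanh * a_prev             # weight gradient sums over timesteps
    da_prevt = waa * dtanh             # flows back to timestep t-1

# numerical check by central finite differences
eps = 1e-6
num = (loss(waa + eps, wax, x, a0) - loss(waa - eps, wax, x, a0)) / (2 * eps)
```

If you drop the "+ da_prevt" in da_next, the analytic dwaa no longer agrees with the numerical estimate, which is exactly the symptom reported above: correct shapes, wrong values.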
