C5W1A1 Exercise 3.1

Hi there,
I am trying to work out the ungraded Exercise 3.1. I can run the code, but the result is wrong.
I believe my mistake is in the initialization of da_prevt. What I am doing now is:

  1. assign da_prevt from da[:, :, T_x-1] to pick the last slice,
  2. when t loops backward over T_x, call rnn_cell_backward to get the gradients, with the parameters:
    da_prevt and caches[t]

May I have some guidance about what I am doing wrong? Many thanks!

Leon

Hi Leon,

When we compute the gradients for a step in rnn_backward, they depend on da, which holds the upstream gradients (flowing from the loss back to a), and da_prevt, which is the gradient coming from the subsequent RNN cell. Note that da and da_prevt are different.
Hence, when calling rnn_cell_backward, you should include both the da of the current step and the da_prevt from the previous backward step.
If you still get stuck, my additional hint is to pass da[:, :, t] + da_prevt to rnn_cell_backward. Note that da_prevt for the first backward step is zeros, because that step is the last cell.
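To see why the two terms add up: a⟨t⟩ affects the loss along two paths, through the prediction ŷ⟨t⟩ at step t and through the next hidden state a⟨t+1⟩, so by the chain rule the total gradient is the sum over both paths:

$$
\frac{\partial \mathcal{L}}{\partial a^{\langle t \rangle}} =
\underbrace{\frac{\partial \mathcal{L}}{\partial \hat{y}^{\langle t \rangle}} \frac{\partial \hat{y}^{\langle t \rangle}}{\partial a^{\langle t \rangle}}}_{\texttt{da[:, :, t]}}
+ \underbrace{\frac{\partial \mathcal{L}}{\partial a^{\langle t+1 \rangle}} \frac{\partial a^{\langle t+1 \rangle}}{\partial a^{\langle t \rangle}}}_{\texttt{da\_prevt}}
$$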
Hope this helps.

K


Hi kienmn,

Many thanks for the reply. Sorry I didn't explain my issue clearly.
I have passed "rnn_cell_backward" and its result matches the expected output. I am actually stuck at the "rnn_backward" part.

I can run the code, but the result is different from the expected output.

I suspect my mistake happens when retrieving the values for da_prevt and caches. What I wrote is:
da_prevt = da[:,:,-1]

and then, in the loop, I pass two parameters to rnn_cell_backward, as below:
gradients = rnn_cell_backward(da_prevt, caches[t])

Am I making a mistake here?

many thanks for your attention!

Leon

You should re-read the answer by @kienmn, because he basically spelled it out. But, again, to summarize: you should initialize da_prevt = np.zeros((n_a, m)), like every other gradient. Then, within the loop, call rnn_cell_backward with da[:, :, t] + da_prevt.
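For anyone who finds this later, here is a minimal sketch of that loop. It assumes the notebook's variables (da, caches, n_a, m, T_x), the rnn_cell_backward(da_next, cache) signature from the assignment, and that the returned dictionary uses the key "da_prev"; it also omits the accumulation of the parameter gradients, so treat it as an outline rather than a drop-in solution:

```python
import numpy as np

# Sketch of the reverse loop in rnn_backward (names from the assignment notebook).
n_a, m, T_x = da.shape            # da: upstream gradients from the loss, shape (n_a, m, T_x)
da_prevt = np.zeros((n_a, m))     # the last cell has no subsequent cell, so start at zero

for t in reversed(range(T_x)):
    # total gradient into a<t> = gradient from the loss at step t
    # plus the gradient carried back from step t+1
    gradients = rnn_cell_backward(da[:, :, t] + da_prevt, caches[t])
    da_prevt = gradients["da_prev"]   # handed back to step t-1 on the next iteration
```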


Thank you so much @Yuriy and @kienmn! Problem solved.

Thanks. da[:, :, t] + da_prevt helps and the test passes.