Hello dear Deeplearning.ai community.
I can’t seem to get the correct results for RNN backprop. The shapes of my tensors are correct, but the values are different.
It’s written that I have to choose da_next and cache wisely.
Something tells me that I didn’t exhibit the necessary wisdom choosing them. Otherwise, I don’t see any other places where I could make a mistake.
I’m wondering if anybody can give me a hint on how to pick da_next and cache, please.
Thank you in advance.
Ivan
If anybody is interested, I found a solution.
The problem is with choosing da_next. Read the third bullet of the instructions provided before the exercise: "You must combine this with the loss from the previous stages when calling rnn_cell_backward (see figure 7 above)."
Honestly speaking, when reading the instructions I didn’t understand at all what that meant and just forgot about it.
Here are my comments that may help you understand what that means and choose da_next “wisely”.
First, it refers to figure 7. But if you actually follow figure 7, you will “combine this” with a^<t> instead of the “loss from the previous stages” mentioned in the instruction (in figure 7, da[:,:,t] in the blue box is added to a^<t> coming from the right, and the sum becomes da_next).
Second, the word “loss” in the instruction actually means the derivative of the loss function with respect to a at the previous time step, not the loss function itself as one might think.
I tried this and it worked: you can compute da_next as da[:,:,t] + da_prevt, where da_prevt is initialized with zeros for the last RNN cell. Note that da_prevt is updated in each reverse iteration over t.
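To make the idea concrete, here is a minimal NumPy sketch of the backward loop (not the course’s exact code — the cell backward here is a simplified stand-in for a tanh RNN cell without biases, and all shapes and names are hypothetical). The point is the line where the per-step gradient da[:, :, t] is combined with da_prevt flowing back from the step after it:

```python
import numpy as np

def rnn_cell_backward(da_next, cache):
    """Simplified stand-in: backprop through a_next = tanh(Waa @ a_prev + Wax @ xt)."""
    a_next, a_prev, xt, Waa, Wax = cache
    dtanh = (1 - a_next ** 2) * da_next   # derivative of tanh
    da_prev = Waa.T @ dtanh               # gradient w.r.t. the previous hidden state
    dxt = Wax.T @ dtanh                   # gradient w.r.t. the input at step t
    return da_prev, dxt

def rnn_backward(da, caches):
    n_a, m, T_x = da.shape
    n_x = caches[0][4].shape[1]           # Wax has shape (n_a, n_x)
    da_prevt = np.zeros((n_a, m))         # zeros for the last cell
    dx = np.zeros((n_x, m, T_x))
    for t in reversed(range(T_x)):
        # The key step: combine the loss gradient at step t with the
        # gradient flowing back from step t+1 ("choosing da_next wisely").
        da_next = da[:, :, t] + da_prevt
        da_prevt, dx[:, :, t] = rnn_cell_backward(da_next, caches[t])
    return dx, da_prevt

# Tiny usage example with random data (hypothetical sizes).
n_a, n_x, m, T_x = 4, 3, 2, 5
rng = np.random.default_rng(0)
Waa = rng.standard_normal((n_a, n_a)) * 0.1
Wax = rng.standard_normal((n_a, n_x)) * 0.1
x = rng.standard_normal((n_x, m, T_x))
a_prev, caches = np.zeros((n_a, m)), []
for t in range(T_x):
    a_next = np.tanh(Waa @ a_prev + Wax @ x[:, :, t])
    caches.append((a_next, a_prev, x[:, :, t], Waa, Wax))
    a_prev = a_next
da = rng.standard_normal((n_a, m, T_x))
dx, da0 = rnn_backward(da, caches)
print(dx.shape, da0.shape)  # (3, 2, 5) (4, 2)
```

If you instead pass only da[:, :, t] into the cell (forgetting da_prevt), the shapes still come out right, which is exactly why the values end up wrong while the shape checks pass.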