Week 1 Assignment 1 Backprop question

Why do we need to use (da[:,:,t]+da_prevt) instead of only (da_prevt) in the for loop in the rnn_backward block? I just couldn’t get the intuition of why “adding gradients” would work there.

I also have the same question. Can anyone help with this?


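da_prevt comes from later RNN cells: it is the gradient that flows back into the hidden state at step t from step t+1.

da[:,:,t] comes from the softmax and dense layers that produce y at step t. It is calculated elsewhere and just passed to our function. The hidden state at step t feeds both of those paths, so by the chain rule its total gradient is the sum of the two.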
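If it helps to see it in numbers, here is a minimal NumPy sketch (not the assignment's code) that checks the sum against a numerical gradient. Everything in it is an assumption made for illustration: the toy sizes, the stand-in loss 0.5 * sum(a**2) that makes every a[:,:,t] feed the loss directly, the input being added straight to the pre-activation, and the hypothetical flag add_direct_term.

```python
import numpy as np

rng = np.random.default_rng(0)
n_a, m, T = 3, 2, 4                      # hidden size, batch size, time steps (arbitrary)
Waa = rng.normal(size=(n_a, n_a)) * 0.5
x = rng.normal(size=(n_a, m, T))         # toy input, added directly to the pre-activation
a0 = np.zeros((n_a, m))

def forward(Waa):
    a = np.zeros((n_a, m, T))
    a_prev = a0
    for t in range(T):
        a_prev = np.tanh(Waa @ a_prev + x[:, :, t])
        a[:, :, t] = a_prev
    return a

def loss(Waa):
    # Stand-in for the per-step output loss: each a[:, :, t] feeds the loss directly.
    return 0.5 * np.sum(forward(Waa) ** 2)

def backward(Waa, add_direct_term=True):
    a = forward(Waa)
    da = a.copy()                        # dLoss/da[:, :, t] along the direct (output) path
    dWaa = np.zeros_like(Waa)
    da_prevt = np.zeros((n_a, m))        # nothing flows in from beyond the last step
    for t in reversed(range(T)):
        upstream = da[:, :, t] + da_prevt if add_direct_term else da_prevt
        dz = (1 - a[:, :, t] ** 2) * upstream      # back through tanh
        a_prev = a[:, :, t - 1] if t > 0 else a0
        dWaa += dz @ a_prev.T
        da_prevt = Waa.T @ dz            # gradient handed to the earlier cell
    return dWaa

def numerical_grad(Waa, eps=1e-6):
    # Central differences on every entry of Waa.
    grad = np.zeros_like(Waa)
    for i in range(n_a):
        for j in range(n_a):
            Wp, Wm = Waa.copy(), Waa.copy()
            Wp[i, j] += eps
            Wm[i, j] -= eps
            grad[i, j] = (loss(Wp) - loss(Wm)) / (2 * eps)
    return grad

err_sum = np.max(np.abs(backward(Waa, True) - numerical_grad(Waa)))
err_only_prev = np.max(np.abs(backward(Waa, False) - numerical_grad(Waa)))
print("max error with da[:,:,t] + da_prevt:", err_sum)
print("max error with da_prevt only:       ", err_only_prev)
```

In this sketch the summed version agrees with the numerical gradient, while the da_prevt-only version returns all zeros: da_prevt starts at zero at the last time step, so without the direct term no gradient ever enters the chain.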

I hope it makes sense.

Thanks, it makes sense to me now.