The reason is the way that RNNs work: the same cell with the same coefficients is used at every timestep. Each timestep therefore contributes its own gradients during back propagation, and the way you take that into account is by adding up the gradients across the timesteps. Of course you're also averaging them over the training samples in the batch, unless you're doing pure Stochastic Gradient Descent with a batch size of 1. There is some ambiguity here: because of the Chain Rule, the gradients at a given timestep include contributions from all the later timesteps. Should we apply the updates one timestep at a time and then recompute? That would add another loop to the whole process and be very inefficient, so we just add everything up within a given iteration. Gradient Descent is an approximation method and is statistical anyway, and it apparently works well enough this way.
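Here's a minimal sketch of that idea, just to make the "add them up" part concrete. It assumes a single tanh RNN cell unrolled over a few timesteps; the names (Wax, Waa, ba) and shapes are illustrative, not taken from any particular assignment, and the "loss" gradients are faked as ones so the backward loop has something to propagate:

```python
import numpy as np

np.random.seed(0)
n_x, n_a, T = 3, 5, 4                    # input size, hidden size, timesteps
Wax = np.random.randn(n_a, n_x) * 0.1    # input-to-hidden weights (shared across timesteps)
Waa = np.random.randn(n_a, n_a) * 0.1    # hidden-to-hidden weights (shared across timesteps)
ba  = np.zeros((n_a, 1))

x = [np.random.randn(n_x, 1) for _ in range(T)]   # one input vector per timestep

# Forward pass: cache what we need for backprop.
a_prev = np.zeros((n_a, 1))
cache = []
for t in range(T):
    a = np.tanh(Wax @ x[t] + Waa @ a_prev + ba)
    cache.append((x[t], a_prev, a))
    a_prev = a

# Pretend the loss gradient w.r.t. each hidden state is just ones
# (in a real model this would come from the output layer at each timestep).
da = [np.ones((n_a, 1)) for _ in range(T)]

# Backward pass (backprop through time): because the SAME Wax, Waa, ba
# are used at every timestep, their gradients are accumulated over t.
dWax = np.zeros_like(Wax)
dWaa = np.zeros_like(Waa)
dba  = np.zeros_like(ba)
da_next = np.zeros((n_a, 1))   # gradient flowing back from later timesteps

for t in reversed(range(T)):
    x_t, a_prev_t, a_t = cache[t]
    # Total gradient at this timestep = direct gradient + gradient from later steps
    dz = (da[t] + da_next) * (1 - a_t ** 2)   # derivative of tanh
    dWax += dz @ x_t.T        # summed, not overwritten
    dWaa += dz @ a_prev_t.T   # summed, not overwritten
    dba  += dz
    da_next = Waa.T @ dz      # passes the gradient back to the previous timestep

print(dWax.shape, dWaa.shape, dba.shape)
```

The key lines are the `+=` updates inside the backward loop: one parameter update per iteration, with each timestep's contribution simply added into the same gradient matrices.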
Here’s another thread from a while back that discusses this same point in some detail. Please start with the linked post and read forward through the thread.