The parameter gradients are the sums of the gradients computed at each timestep. The action is depicted in Figure 7:
That is what is happening in that "+" sign in the green oval that I added at the right-hand side of the diagram. It adds the da for the current timestep to the cumulative sum of all the da values from the later timesteps, which arrived as da_{prev} from the point of view of those later timesteps. You can see the current step feeding its own da_{prev} off the left side of the diagram to the previous timestep. Of course this is backprop, so we are going backwards, and in an RNN that means "backwards in time", right? There is just one "layer", but we repeat it over and over and feed the results forward.
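To make that concrete, here is a minimal NumPy sketch of the backward loop for a vanilla RNN cell, assuming a_t = tanh(Wax·x_t + Waa·a_{t-1} + ba). The function name, cache layout, and shapes are my own illustrative assumptions, not the exact code behind the diagram; the line to look at is `da[:, :, t] + da_prev`, which is the "+" in the green oval.

```python
import numpy as np

def rnn_backward(da, caches, Wax, Waa):
    """Backprop through time for a vanilla tanh RNN (illustrative sketch).

    da:     (n_a, m, T) upstream gradient on each hidden state a_t
    caches: list of (a_t, a_prev, x_t) tuples saved during the forward pass
    """
    n_a, m, T = da.shape
    n_x = caches[0][2].shape[0]
    dWax = np.zeros((n_a, n_x))
    dWaa = np.zeros((n_a, n_a))
    dba = np.zeros((n_a, 1))
    da_prev = np.zeros((n_a, m))          # nothing has flowed back yet

    for t in reversed(range(T)):          # "backwards in time"
        a_t, a_prev, x_t = caches[t]
        # The "+" in the green oval: local gradient at this timestep
        # plus the accumulated gradient from all later timesteps.
        da_t = da[:, :, t] + da_prev
        dtanh = (1 - a_t ** 2) * da_t     # back through the tanh
        dWax += dtanh @ x_t.T             # parameter grads sum over timesteps
        dWaa += dtanh @ a_prev.T
        dba += dtanh.sum(axis=1, keepdims=True)
        da_prev = Waa.T @ dtanh           # fed off the left side to step t-1

    return da_prev, dWax, dWaa, dba

# Tiny demo with random data, so the shapes are easy to check.
rng = np.random.default_rng(0)
n_a, n_x, m, T = 4, 3, 2, 5
Wax = rng.standard_normal((n_a, n_x))
Waa = rng.standard_normal((n_a, n_a))
ba = np.zeros((n_a, 1))
a, caches = np.zeros((n_a, m)), []
for t in range(T):                        # forward pass, saving caches
    x_t = rng.standard_normal((n_x, m))
    a_new = np.tanh(Wax @ x_t + Waa @ a + ba)
    caches.append((a_new, a, x_t))
    a = a_new
da = rng.standard_normal((n_a, m, T))     # stand-in upstream gradients
da0, dWax, dWaa, dba = rnn_backward(da, caches, Wax, Waa)
```

Note that `dWax`, `dWaa`, and `dba` are accumulated with `+=` inside the loop, which is exactly the "sums of the gradients at each timestep" idea: one set of weights, reused at every step, so every step contributes to the same gradient.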