I have a question about the following statement from the backpropagation part of Programming Assignment 1 in Week 1:
Note: rnn_cell_backward does not include the calculation of loss from 𝑦⟨𝑡⟩.
This is incorporated into the incoming da_next. This is a slight mismatch
with rnn_cell_forward, which includes a dense layer and softmax.
If the note applies at time step t, does it mean that the da_{next} passed into rnn_cell_backward at time step t - 1 is not just da_{prev} = W_{aa}^T dtanh as presented above, but rather da_{prev} + \frac{\partial L(y^{<t-1>},\hat y^{<t-1>})}{\partial a^{<t-1>}}, where the second term (the gradient coming from the loss at step t - 1) is assumed to be given?
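To make the question concrete, here is a minimal NumPy sketch of the reading I have in mind. It is my own illustration, not the assignment's code: the function names ending in `_sketch` and the array `da_loss` (the gradient of the loss with respect to each hidden state, assumed to be supplied from outside) are hypothetical, and I assume the cell computes a_next = tanh(W_aa a_prev + W_ax x_t + b_a).

```python
import numpy as np

def rnn_cell_backward_sketch(da_next, a_next, a_prev, xt, Wax, Waa):
    """Backward pass through a_next = tanh(Waa @ a_prev + Wax @ xt + ba).

    da_next is the TOTAL gradient arriving at a_next. The cell itself
    knows nothing about the loss; that part must already be inside da_next.
    """
    dtanh = (1.0 - a_next ** 2) * da_next   # gradient through tanh
    da_prev = Waa.T @ dtanh                 # recurrent path only
    dWaa = dtanh @ a_prev.T
    dWax = dtanh @ xt.T
    dba = dtanh.sum(axis=1, keepdims=True)
    return da_prev, dWaa, dWax, dba

def rnn_backward_sketch(da_loss, a, x, a0, Wax, Waa):
    """Loop over time steps in reverse.

    da_loss[:, :, t] is assumed given: the gradient of the loss at step t
    with respect to a<t> (hypothetical name, not from the assignment).
    """
    n_a, m, T = da_loss.shape
    da_prev = np.zeros((n_a, m))            # nothing flows in from beyond step T-1
    dWaa = np.zeros_like(Waa)
    dWax = np.zeros_like(Wax)
    dba = np.zeros((n_a, 1))
    for t in reversed(range(T)):
        a_prev = a[:, :, t - 1] if t > 0 else a0
        # Total gradient at a<t>: loss path at step t plus recurrent path from step t+1.
        da_total = da_loss[:, :, t] + da_prev
        da_prev, dWaa_t, dWax_t, dba_t = rnn_cell_backward_sketch(
            da_total, a[:, :, t], a_prev, x[:, :, t], Wax, Waa
        )
        dWaa += dWaa_t
        dWax += dWax_t
        dba += dba_t
    return da_prev, dWaa, dWax, dba         # da_prev is now the gradient w.r.t. a0
```

In this sketch the cell function only ever returns W_{aa}^T dtanh. The loss term enters solely through the sum `da_loss[:, :, t] + da_prev` in the outer loop, which is the combination I am asking about.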