I am confused about W1A1 "Building your Recurrent Neural Network - Step by Step", section 3.2 - LSTM Backward Pass. In the two backward functions,

`def lstm_cell_backward(da_next, dc_next, cache):`

and

`def lstm_backward(da, caches):`

I did not see the gradients of Wy and by (i.e., dWy and dby), which I believe should also be updated. I checked other resources and did find backpropagation equations for dWy and dby (LSTM Back-Propagation Derivation | Kartik Shenoy | Medium). Has anybody else had this confusion? If I missed anything in the notebook, please let me know; I would really appreciate it.

They explain at the beginning of the backprop section that the notebook isn't covering the full path back to the loss. Here's the relevant quote:

> Note that this notebook does not implement the backward path from the Loss 'J' backwards to 'a'. This would have included the dense layer and softmax, which are a part of the forward path. This is assumed to be calculated elsewhere and the result passed to `rnn_backward` in 'da'. It is further assumed that loss has been adjusted for batch size (m) and division by the number of examples is not required here.

That applies to both the RNN and LSTM sections.
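
In other words, dWy and dby belong to the dense + softmax output layer, not to the recurrent cell, so they are computed outside `lstm_backward`, and only the resulting da is passed in. For intuition, here is a minimal sketch of what that "elsewhere" computation could look like. It assumes softmax outputs with cross-entropy loss, one-hot labels, and the usual shape conventions from the notebook; the helper name `dense_softmax_backward` is hypothetical and not part of the course's actual code:

```python
import numpy as np

def dense_softmax_backward(y_hat, y, a, Wy):
    """
    Hypothetical sketch: backprop from the loss 'J' through the
    softmax + dense layer (Wy, by) down to the hidden states 'a'.

    y_hat -- softmax predictions, shape (n_y, m, T_x)
    y     -- one-hot labels,      shape (n_y, m, T_x)
    a     -- hidden states,       shape (n_a, m, T_x)
    Wy    -- dense-layer weights, shape (n_y, n_a)
    """
    n_a, m, T_x = a.shape
    n_y = Wy.shape[0]

    dWy = np.zeros_like(Wy)
    dby = np.zeros((n_y, 1))
    da = np.zeros((n_a, m, T_x))

    for t in range(T_x):
        # For softmax with cross-entropy, the gradient w.r.t. the
        # pre-activation z_t simplifies to (y_hat - y).
        dz = y_hat[:, :, t] - y[:, :, t]        # (n_y, m)
        # Accumulate the output-layer gradients over all time steps.
        # Per the notebook's note, the loss is assumed already adjusted
        # for batch size, so there is no division by m here.
        dWy += np.dot(dz, a[:, :, t].T)          # (n_y, n_a)
        dby += np.sum(dz, axis=1, keepdims=True)  # (n_y, 1)
        # Gradient flowing into the recurrent cell at step t.
        da[:, :, t] = np.dot(Wy.T, dz)            # (n_a, m)

    return da, dWy, dby
```

Under those assumptions, the `da` returned here is what would be fed into `lstm_backward` (or `rnn_backward`), while dWy and dby would be applied in the parameter update alongside the gradients those functions return for the cell's own weights.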