C3W2_Vanishing_Gradients : def prod(k)

In the function implementation of prod(k), should the term W_hh@h[:,i] be W_hh@h[:,i-1]?

Hello, I’m not sure if I’m looking at an older version of the course. Vanishing gradients seems to be in C2W3 for me, and the lab doesn’t have a function named prod(k). Can you share a code snippet or gist on git?

Hi, here is a snapshot:

Hello @rocki, sorry for the delayed response. Your interpretation is correct - based on the formula in the notebook:

h_i=\sigma(W_{hh}\mathbf{h_{i-1}}+W_{hx}x_i+b_h)

or \frac{\partial h_i}{\partial h_{i-1}} = W_{hh}^T \operatorname{diag}(\sigma'(W_{hh}\mathbf{h_{i-1}} + W_{hx}x_i + b_h))

I will raise an issue on GitHub for the course staff to act on.
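
For anyone following along, here is a minimal sketch of what the corrected indexing could look like. It is not the notebook's code: the helper names (sigmoid, sigmoid_gradient, forward, jacobian), the explicit parameter passing, the storage layout (column i of h holds h_i, with column 0 as the initial state), the random toy data, and the range and ordering of the product are all assumptions made for illustration. The only part taken from the formula above is that the Jacobian at step i is evaluated with h[:, i-1], not h[:, i].

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_gradient(z):
    s = sigmoid(z)
    return s * (1.0 - s)

def forward(W_hh, W_hx, b_h, h0, x):
    # Assumed layout: column i of h stores h_i; column 0 is the initial state.
    n, T = W_hh.shape[0], x.shape[1] - 1
    h = np.zeros((n, T + 1))
    h[:, 0] = h0
    for i in range(1, T + 1):
        h[:, i] = sigmoid(W_hh @ h[:, i - 1] + W_hx @ x[:, i] + b_h)
    return h

def jacobian(i, W_hh, W_hx, b_h, h, x):
    # d h_i / d h_{i-1} = W_hh^T diag(sigma'(W_hh h_{i-1} + W_hx x_i + b_h)).
    # Note the previous state h[:, i - 1]. Requires i >= 1, because
    # h[:, -1] would silently wrap around to the last column.
    z = W_hh @ h[:, i - 1] + W_hx @ x[:, i] + b_h
    return W_hh.T @ np.diag(sigmoid_gradient(z))

def prod(k, W_hh, W_hx, b_h, h, x):
    # Product of the step Jacobians for i = t_max down to k (assumed
    # range and multiplication order; the lab's version may differ).
    t_max = h.shape[1] - 1
    p = np.eye(W_hh.shape[0])
    for i in range(t_max, k - 1, -1):
        p = p @ jacobian(i, W_hh, W_hx, b_h, h, x)
    return p

# Toy data, purely illustrative.
rng = np.random.default_rng(0)
n, m, T = 4, 3, 10
W_hh = rng.normal(size=(n, n))
W_hx = rng.normal(size=(n, m))
b_h = rng.normal(size=n)
x = rng.normal(size=(m, T + 1))  # column 0 is unused
h = forward(W_hh, W_hx, b_h, np.zeros(n), x)

for k in (T, T // 2, 1):
    print(k, np.linalg.norm(prod(k, W_hh, W_hx, b_h, h, x)))
```

Comparing the printed norms for different k shows how the product behaves as more steps are multiplied in. Passing the weights explicitly instead of relying on globals is only a choice made here to keep the sketch self-contained.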