I understand that we can backpropagate from the loss through the softmax to a_next, and that this gradient gets added to da_next. But at the final timestep of the model there is no later cell state feeding in, so how is dc_next calculated? Is it initialized with zeros, or do we use the connection between a_next and c_next?
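(For context: in typical implementations of this backward pass, there is no gradient arriving from beyond the last timestep, so both da_next and dc_next start as zeros before the backward loop; the gradient of the cell state then accumulates as the loop runs. A minimal NumPy sketch, where n_a and m are hypothetical hidden and batch sizes:)

```python
import numpy as np

n_a, m = 5, 10  # hypothetical hidden size and batch size

# At the final timestep no gradient flows in from a later cell state,
# so the running gradients start as zeros before iterating backward
# over the timesteps.
da_next = np.zeros((n_a, m))
dc_next = np.zeros((n_a, m))

# Inside the backward loop, each step's cell contribution would then
# be added into dc_next as the loop moves toward t = 0.
```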
In the last block, where I believe I entered the correct code, the result does not match the expected output.
Please click on my name to start a private message. Then, attach your notebook as a .ipynb file. Please note that mentors cannot access your Coursera Jupyter workspace, so sending the notebook in a .ipynb format is essential.
Please read this section of the markdown for lstm_forward and fix c_next:
Initialize c^{\langle t \rangle} with zeros.
- The variable name is c_next
- c^{\langle t \rangle} represents a single time step, so its shape is (n_{a}, m)
- Note: create c_next as its own variable with its own location in memory. Do not initialize it as a slice of the 3D tensor c. In other words, don’t do c_next = c[:,:,0]
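Concretely, the quoted instruction amounts to something like this sketch (n_a, m, and T_x are hypothetical dimensions):

```python
import numpy as np

n_a, m, T_x = 5, 10, 4  # hypothetical hidden size, batch size, sequence length

c = np.zeros((n_a, m, T_x))  # 3D tensor of cell states across all timesteps

# Correct: c_next is its own array, with its own memory.
c_next = np.zeros((n_a, m))

# Incorrect: basic slicing returns a view into c, so writing to c_next
# would silently modify c as well.
# c_next = c[:, :, 0]
```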
The recommendation you provided refers to a different exercise of the assignment, which was working fine beforehand. The mismatch is happening in Exercise 8.
The checks in the notebook are not exhaustive. So, just because your implementation of lstm_forward passes the test code that follows it, that doesn’t mean that lstm_forward is perfect.
The test for exercise 8 invokes lstm_forward before running lstm_backward. Do you have a good reason for not following my instructions?