To close the loop on the public thread, it turns out that the back prop code was correct and that the actual error was in the forward prop code in lstm_forward. That function was used in the test cell for lstm_backward.
The mistake in lstm_forward was to initialize c_next like this:
c_next = c[:,:,0]
They specifically warn you against doing that in the instructions, because a slice like that does not copy anything: it makes c_next a view onto the same memory as c, so any in-place write to c_next also changes c, which does not end well.
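Here is a minimal sketch of that aliasing, in case it helps to see it in isolation. The shapes (n_a, m, T_x) and the names c_view, c2 and c_fresh are made up for illustration and are not taken from the assignment code; the only thing being shown is that a basic NumPy slice shares memory with the array it came from, while a copy does not.

```python
import numpy as np

# Hypothetical sizes: n_a hidden units, m examples, T_x time steps.
n_a, m, T_x = 2, 3, 4

# Case 1: a basic slice is a view, so it shares memory with c.
c = np.zeros((n_a, m, T_x))
c_view = c[:, :, 0]
view_shares = np.shares_memory(c_view, c)
c_view += 1.0  # in-place write goes straight through to c[:, :, 0]

# Case 2: an explicit copy owns its own memory.
c2 = np.zeros((n_a, m, T_x))
c_next = c2[:, :, 0].copy()
copy_shares = np.shares_memory(c_next, c2)
c_next += 1.0  # c2 is left untouched

# Case 3: a freshly allocated array is independent as well.
c_fresh = np.zeros((n_a, m))

print("view shares memory:", view_shares)
print("copy shares memory:", copy_shares)
print("c[:, :, 0] after writing to the view:\n", c[:, :, 0])
print("c2[:, :, 0] after writing to the copy:\n", c2[:, :, 0])
```

Either the copy or the fresh np.zeros array avoids the problem. Which one fits depends on what the instructions ask for as the initial value.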
One of the “gotchas” to watch out for in Python is that objects (e.g. NumPy arrays or dictionaries) are passed “by reference” on function calls. The function receives the same object the caller holds, not a copy, so you have to be very careful when you write to an object inside a function: you may be modifying a global object. Here’s a thread which discusses some other cases in which that is a problem and shows other examples to watch out for.
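To make the function call case concrete, here is a small sketch using a plain dictionary. The function names add_bias_in_place and add_bias_copy and the variables settings and settings2 are hypothetical, invented just for this example.

```python
def add_bias_in_place(params):
    # Writes into the very dict the caller passed in.
    params["b"] = 1.0
    return params


def add_bias_copy(params):
    # Works on a shallow copy, so the caller's dict is left alone.
    new_params = dict(params)
    new_params["b"] = 1.0
    return new_params


settings = {"W": 0.5}
result = add_bias_in_place(settings)
same_object = result is settings

settings2 = {"W": 0.5}
result2 = add_bias_copy(settings2)

print("in place, caller's dict is now:", settings)
print("returned object is the caller's dict:", same_object)
print("with a copy, caller's dict is still:", settings2)
print("with a copy, returned dict:", result2)
```

Note that dict(params) is only a shallow copy. If the values are themselves mutable (for example NumPy arrays), they are still shared, and copy.deepcopy from the standard library is the usual way to get a fully independent object.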