lstm_backward gives wrong results

Hi All,

The lstm_backward function gives me wrong outputs even though everything else passed all tests. I checked everything many times and it all looks correct. How do I investigate this? I also checked all the forum topics and saw that people had similar issues, but I don’t think I have the same problem. I would appreciate any recommendations.

My outputs:

gradients["dx"][1][2] = [ 0.00131172  0.09282475 -0.5368476  -0.43281115]
gradients["dx"].shape = (3, 10, 4)
gradients["da0"][2][3] = -0.04194056823977163
gradients["da0"].shape = (5, 10)
gradients["dWf"][3][1] = -0.08879746073385465
gradients["dWf"].shape = (5, 8)
gradients["dWi"][1][2] = 0.10522688299053953
gradients["dWi"].shape = (5, 8)
gradients["dWc"][3][1] = -0.09660358355637418
gradients["dWc"].shape = (5, 8)
gradients["dWo"][1][2] = 0.028542627871270824
gradients["dWo"].shape = (5, 8)
gradients["dbf"][4] = [-0.02202898]
gradients["dbf"].shape = (5, 1)
gradients["dbi"][4] = [-0.14664914]
gradients["dbi"].shape = (5, 1)
gradients["dbc"][4] = [-0.34655979]
gradients["dbc"].shape = (5, 1)
gradients["dbo"][4] = [-0.23460769]
gradients["dbo"].shape = (5, 1)

Expected Output:

gradients["dx"][1][2] =	[0.00218254 0.28205375 -0.48292508 -0.43281115]
gradients["dx"].shape =	(3, 10, 4)
gradients["da0"][2][3] =	0.312770310257
gradients["da0"].shape =	(5, 10)
gradients["dWf"][3][1] =	-0.0809802310938
gradients["dWf"].shape =	(5, 8)
gradients["dWi"][1][2] =	0.40512433093
gradients["dWi"].shape =	(5, 8)
gradients["dWc"][3][1] =	-0.0793746735512
gradients["dWc"].shape =	(5, 8)
gradients["dWo"][1][2] =	0.038948775763
gradients["dWo"].shape =	(5, 8)
gradients["dbf"][4] =	[-0.15745657]
gradients["dbf"].shape =	(5, 1)
gradients["dbi"][4] =	[-0.50848333]
gradients["dbi"].shape =	(5, 1)
gradients["dbc"][4] =	[-0.42510818]
gradients["dbc"].shape =	(5, 1)
gradients["dbo"][4] =	[ -0.17958196]
gradients["dbo"].shape =	(5, 1)

Hey @OlegSlesarev,
Welcome, and we are glad that you could become a part of our community :partying_face: By any chance, did you check out this thread and the fixes suggested in it?

Cheers,
Elemento

Thank you. As I mentioned, I checked everything I could find on this topic in the forums. Plenty of people have had a similar issue, but their outputs are different from mine, and I don’t think I have the issues they had. One person calculated dc_prev incorrectly; my calculation is correct. Another person incorrectly set c_next = c[:,:,0] in forward propagation; I initialised it correctly. I checked all topics on this forum and looked through all the LSTM functions carefully, and they all pass their tests except lstm_backward. I have obviously missed something, but I am not sure what the best way to figure it out is. Some people mentioned that the problem may be in other functions even though they pass all tests, so it is actually pretty tricky to understand where the problem is.

Hey @OlegSlesarev,
Thanks a lot for the follow-up.

Cheers,
Elemento

Hey @OlegSlesarev,
Your issue lies in the implementation of lstm_cell_backward. Note that, as per your code:

gradients["dc_prev"][2][3] = 0.7930239983691791

But the expected output for this is:

gradients["dc_prev"][2][3] = 0.797522038797

An error in the third decimal place is not a rounding error, but an error in your implementation. To fix it, you need to correct the equation for dc_prev in your implementation of the lstm_cell_backward function. I hope this resolves your error.
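
For anyone who ends up here with a similar mismatch, one way to investigate it is a finite-difference check on dc_prev. The sketch below is not the notebook's code: the function names are hypothetical, and it assumes a standard LSTM cell without peephole connections, so the gates (ft, it, cct, ot) do not depend on c_prev and can be held fixed. It compares an analytic dc_prev against a numerical estimate for random inputs of shape (5, 10).

```python
import numpy as np


def cell_state_path(c_prev, ft, it, cct, ot):
    """Forward path from c_prev to (a_next, c_next), gates held fixed."""
    c_next = ft * c_prev + it * cct
    a_next = ot * np.tanh(c_next)
    return a_next, c_next


def analytic_dc_prev(da_next, dc_next, c_next, ft, ot):
    """Analytic gradient of the loss with respect to c_prev (assumed form)."""
    # Gradient reaching c_next: the direct term dc_next plus the term that
    # flows back through a_next = ot * tanh(c_next).
    dc_total = dc_next + ot * (1.0 - np.tanh(c_next) ** 2) * da_next
    # c_next = ft * c_prev + it * cct, so d(c_next)/d(c_prev) = ft.
    return dc_total * ft


def numeric_dc_prev(da_next, dc_next, c_prev, ft, it, cct, ot, eps=1e-6):
    """Central-difference estimate of the same gradient."""

    def surrogate_loss(c):
        # Scalar whose gradients w.r.t. a_next and c_next are exactly
        # da_next and dc_next, matching the upstream gradients.
        a_next, c_next = cell_state_path(c, ft, it, cct, ot)
        return np.sum(da_next * a_next) + np.sum(dc_next * c_next)

    grad = np.zeros_like(c_prev)
    for idx in np.ndindex(*c_prev.shape):
        plus = c_prev.copy()
        minus = c_prev.copy()
        plus[idx] += eps
        minus[idx] -= eps
        grad[idx] = (surrogate_loss(plus) - surrogate_loss(minus)) / (2 * eps)
    return grad


def max_abs_difference(seed=0, shape=(5, 10)):
    """Largest element-wise gap between analytic and numeric dc_prev."""
    rng = np.random.default_rng(seed)
    c_prev = rng.standard_normal(shape)
    da_next = rng.standard_normal(shape)
    dc_next = rng.standard_normal(shape)
    # Sigmoid gates lie in (0, 1); the candidate cct lies in (-1, 1).
    ft, it, ot = (rng.uniform(0.0, 1.0, shape) for _ in range(3))
    cct = rng.uniform(-1.0, 1.0, shape)

    _, c_next = cell_state_path(c_prev, ft, it, cct, ot)
    analytic = analytic_dc_prev(da_next, dc_next, c_next, ft, ot)
    numeric = numeric_dc_prev(da_next, dc_next, c_prev, ft, it, cct, ot)
    return np.max(np.abs(analytic - numeric))


if __name__ == "__main__":
    print("max |analytic - numeric| =", max_abs_difference())
```

To test your own code, replace the body of analytic_dc_prev with the dc_prev expression from your lstm_cell_backward. A correct expression should agree with the numerical estimate to many decimal places, whereas a gap in the third decimal, like the one above, points to a wrong or missing term.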

Cheers,
Elemento

Awesome, thank you so much Elemento! That was super helpful. I have now found the error and my notebook fully works, with all outputs as expected. Thanks a lot for your help!