Error in lstm_backward

{moderator edit - solution code removed}

The error is:
ValueError Traceback (most recent call last)
in
18
19 da_tmp = np.random.randn(5, 10, 4)
---> 20 gradients_tmp = lstm_backward(da_tmp, caches_tmp)
21
22 print("gradients[\"dx\"][1][2] =", gradients_tmp["dx"][1][2])

in lstm_backward(da, caches)
56
57 # Fix the dimensions of dx[:,:,t] to match gradients["dxt"]
---> 58 dx[:, : ,t] = gradients["dxt"]
59
60 dWf += gradients["dWf"]

ValueError: could not broadcast input array from shape (5,3) into shape (3,10)

Hi @AkibButt

In your code, gradients["dxt"] has shape (5, 3), but the slice dx[:, :, t] has shape (3, 10). NumPy cannot broadcast (5, 3) into (3, 10), so the assignment on line 58 raises the ValueError.
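The mismatch can be reproduced in isolation. The following is only a sketch: the sizes are read off the shapes in the traceback, the names n_x, m, T_x and n_a are assumptions about what each dimension means, and bad_dxt and good_dxt are hypothetical stand-ins rather than the real gradients.

```python
import numpy as np

# Sizes taken from the shapes in the traceback; the names are assumptions.
n_x, m, T_x, n_a = 3, 10, 4, 5

dx = np.zeros((n_x, m, T_x))       # dx[:, :, t] has shape (3, 10)
bad_dxt = np.zeros((n_a, n_x))     # hypothetical stand-in with the reported shape (5, 3)

error_message = None
try:
    dx[:, :, 0] = bad_dxt          # shapes are incompatible, so NumPy raises ValueError
except ValueError as err:
    error_message = str(err)
    print(error_message)

good_dxt = np.zeros((n_x, m))      # a (3, 10) array assigns without error
dx[:, :, 0] = good_dxt
```

The assignment itself is fine once the right-hand side is (3, 10), which suggests looking at wherever gradients["dxt"] is computed.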

Make sure that the dimensions are consistent between the forward and backward passes. In particular, check that dx is initialized so that dx[:, :, t] has the same shape as the gradients["dxt"] returned by lstm_cell_backward. Printing both shapes just before the assignment is a quick way to see where they diverge.
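One way to do that shape check is a small helper that prints and compares the shapes before assigning. This is a minimal sketch and not part of the assignment: assign_timestep is a hypothetical name, and the arrays are dummy data with the shapes from the test case.

```python
import numpy as np

def assign_timestep(dx, dxt, t):
    """Copy dxt into dx[:, :, t], raising a descriptive error on a shape mismatch."""
    expected = dx[:, :, t].shape
    if dxt.shape != expected:
        raise ValueError(f"t={t}: dxt has shape {dxt.shape}, expected {expected}")
    dx[:, :, t] = dxt

# Dummy data: 3 input features, 10 examples, 4 time steps.
dx = np.zeros((3, 10, 4))
for t in reversed(range(4)):
    dxt = np.ones((3, 10))         # stand-in for gradients["dxt"]
    print("dxt.shape", dxt.shape, "shape of dx[:,:,t]", dx[:, :, t].shape)
    assign_timestep(dx, dxt, t)
```

If the shapes printed inside the loop already disagree, the problem is upstream of the assignment, in the function that produced dxt.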

Hope it helps!

The code you show looks correct, so the problem may actually be in lstm_cell_backward: the dxt gradient it returns appears to have the wrong shape. So how could that happen, and why didn’t the test case for lstm_cell_backward catch it?

I added some print statements in both functions and here’s what I see when I run that test case:

dxt.shape (3, 10)
shape of dx[:,:,t] (3, 10)
dxt.shape (3, 10)
shape of dx[:,:,t] (3, 10)
dxt.shape (3, 10)
shape of dx[:,:,t] (3, 10)
dxt.shape (3, 10)
shape of dx[:,:,t] (3, 10)