# Help for C5W1A1

For this function:

```python
def lstm_backward(da, caches):
```

this part:

```python
# Compute all gradients using lstm_cell_backward. Choose wisely the "da_next" (same as done for Ex 6).
```

When I call it like this:

```python
da[:, :, t], dc_prevt, caches[t]
```

I get wrong values. When I call it like this:

```python
da[:, :, t] + da_prevt, dc_prevt, caches[t]
```

I get this error:

```
ValueError: operands could not be broadcast together with shapes (5,10) (8,10)
```

By printing, I noticed that the shape of `da_prevt` is (5, 10) before the call, but (8, 10) after it.
The previous function (`lstm_cell_backward`) works fine; all its values print correctly.
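For anyone puzzled by the error message: NumPy can only add arrays whose shapes are broadcast-compatible, and (5, 10) vs. (8, 10) is not. Here 5 would be `n_a` and 8 would be `n_a + n_x`, i.e. the returned gradient still covers the concatenated `[a_prev; x_t]` input. A minimal sketch (shapes assumed from the error above) reproduces it:

```python
import numpy as np

# (n_a, m) vs. (n_a + n_x, m): the second array still covers the
# concatenated [a_prev; x_t] rows, so elementwise addition fails.
da_slice = np.zeros((5, 10))   # da[:, :, t], shape (n_a, m)
da_prevt = np.zeros((8, 10))   # unsliced gradient, shape (n_a + n_x, m)

try:
    da_slice + da_prevt
except ValueError as e:
    print("broadcast error:", e)
```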

Computation of `da_prevt` is incorrect inside `lstm_cell_backward`.

Here’s a hint from the markdown for the exercise:

where the weights for equation 21 are from n_a to the end, (i.e. W_f = W_f[:,n_a:] etc…)

Another hint:
Consider only up to `:n_a` in the 2nd dimension when computing `da_prevt`.
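To make the hint concrete: the concatenated weight matrices have shape `(n_a, n_a + n_x)`, where the first `n_a` columns multiply `a_prev` and the remaining `n_x` columns multiply `x_t`. So `da_prev` should only involve `W[:, :n_a]`. A small shape-only sketch (the gate gradient `dft` is a placeholder, not the real computation):

```python
import numpy as np

n_a, n_x, m = 5, 3, 10                 # hypothetical dimensions from the error
Wf = np.random.randn(n_a, n_a + n_x)   # forget-gate weights over [a_prev; x_t]
dft = np.random.randn(n_a, m)          # placeholder for the gate gradient

# Only the first n_a columns of Wf act on a_prev, so the contribution to
# da_prev uses Wf[:, :n_a] and has shape (n_a, m), not (n_a + n_x, m).
da_prev_part = Wf[:, :n_a].T @ dft
print(da_prev_part.shape)
```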


Ah that worked! Thanks a lot @balaji.ambresh for the valuable suggestion.

A correction though (I mention it for future classmates who might get stuck): the problem lay not with `lstm_cell_backward` (I had done that correctly), but with `lstm_backward`.
I'll paste snippets to show what went wrong.

1. While calling the function in the for loop, the first argument I supplied was `da[:, :, t]` instead of `da[:, :, t] + da_prevt`, so that was one point of fault. This follows from Exercise 6.
2. As you pointed out, I had indexed `da_prevt` wrongly. One point I'd like to clarify, though, is that this variable lives in the `lstm_backward` function, not the `lstm_cell_backward` function; that caused some confusion for me.
In the for loop I had computed `da_prevt` as `gradients['da_prev']` instead of `gradients['da_prev'][:n_a]`. The latter is the correct version.
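Putting both fixes together, the loop body looks roughly like this. This is a runnable sketch only: `lstm_cell_backward_stub` is a hypothetical stand-in that returns a concatenated `(n_a + n_x, m)` gradient, mimicking the setup in my post, and the dimensions are made up.

```python
import numpy as np

n_a, n_x, m, T_x = 5, 3, 10, 4          # hypothetical dimensions

def lstm_cell_backward_stub(da_next, dc_next, cache):
    # Stand-in for the assignment's lstm_cell_backward; returns a
    # concatenated (n_a + n_x, m) 'da_prev', as in the buggy setup above.
    return {'da_prev': np.zeros((n_a + n_x, m)),
            'dc_prev': np.zeros((n_a, m))}

da = np.zeros((n_a, m, T_x))
da_prevt = np.zeros((n_a, m))
dc_prevt = np.zeros((n_a, m))
caches = [None] * T_x

for t in reversed(range(T_x)):
    # Fix 1: accumulate the upstream gradient: da[:, :, t] + da_prevt.
    grads = lstm_cell_backward_stub(da[:, :, t] + da_prevt, dc_prevt, caches[t])
    # Fix 2: keep only the first n_a rows for the next iteration.
    da_prevt = grads['da_prev'][:n_a]
    dc_prevt = grads['dc_prev']

print(da_prevt.shape)   # stays (n_a, m) across iterations
```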

Thanks
