Course 4 Week 1 Assignment 1 Exercise 3 LSTM_Cell_forward

In the lstm_cell_forward function, the first call to the function passes, but the unit test then fails with a shape error:

FT: (5, 10)
IT: (5, 10)
NA 5, 10
np.dot(ft, c_prev.T) + np.dot(it, cct.T): (5, 10), (5, 10), (5, 10), (5, 10)
C_NEXT: (10, 10)
OT: (5, 10)
A_NEXT: (5, 10)
Y PRED: (2, 10)
a_next[4] =
[ 4.81716089 -1.13710276 2.9134558 -3.78599029 -4.12777948 0.73148448
5.61350373 -1.62439911 -0.79094036 4.61924777]
a_next.shape = (5, 10)
c_next[2] =
[ 1.10813743 0.15779456 0.73337312 -1.07886044 0.20533492 -1.13419987
0.53119246 -1.96479545 0.36945115 2.7411064 ]
c_next.shape = (10, 10)
yt[1] = [1.42575110e-11 9.65637542e-01 1.10909027e-08 1.00000000e+00
1.00000000e+00 6.51188392e-02 1.20492595e-14 9.99969715e-01
7.10569633e-01 4.18065550e-12]
yt.shape = (2, 10)
cache[1][3] =
[ 1.29525389 0.43648659 1.04067727 -1.3684517 0.13513058 -0.70822327
2.05123227 -1.50921226 1.01509763 1.80309553]
len(cache) = 10
FT: (7, 8)
IT: (7, 8)
NA 7, 8
np.dot(ft, c_prev.T) + np.dot(it, cct.T): (7, 8), (7, 8), (7, 8), (7, 8)
C_NEXT: (8, 8)
OT: (7, 8)
A_NEXT: (7, 8)
Y PRED: (3, 8)


AssertionError Traceback (most recent call last)
in
27
28 # UNIT TEST
---> 29 lstm_cell_forward_test(lstm_cell_forward)

~/work/W1A1/public_tests.py in lstm_cell_forward_test(target)
112 assert cache[5].shape == (n_a, m), f"Wrong shape for cache5. {cache[5].shape} != {(n_a, m)}"
113 assert cache[6].shape == (n_a, m), f"Wrong shape for cache6. {cache[6].shape} != {(n_a, m)}"
--> 114 assert cache[1].shape == (n_a, m), f"Wrong shape for cache1. {cache[1].shape} != {(n_a, m)}"
115 assert cache[7].shape == (n_a, m), f"Wrong shape for cache7. {cache[7].shape} != {(n_a, m)}"
116 assert cache[0].shape == (n_a, m), f"Wrong shape for cache0. {cache[0].shape} != {(n_a, m)}"

AssertionError: Wrong shape for cache1. (8, 8) != (7, 8)

In the formula:
[image: formula for the next cell state]

The products are element-wise products, not dot products.

@TMosh Thanks a ton, I have been stuck on this for the last 4 hours.

The clue to diagnosing this sort of issue is that the result of a dot product is generally a matrix with a different shape than either of its inputs (the exception being square matrices).

Since this line of code computes the next cell state, and it is based on products using the current cell state, it is likely that the shapes should remain the same.
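To make the shape argument concrete, here is a minimal sketch using the sizes from the failing test (n_a = 7, m = 8). The random arrays are stand-ins for the real gate values, and the two dot-product variants are illustrative guesses at how the transposes might have been arranged, not the original poster's exact code.

```python
import numpy as np

# Assumed sizes, taken from the failing unit test: n_a = 7 hidden units, m = 8 examples.
n_a, m = 7, 8
rng = np.random.default_rng(0)

# Random stand-ins for the real values; only the shapes matter here.
ft = rng.random((n_a, m))               # forget gate
it = rng.random((n_a, m))               # update gate
cct = rng.standard_normal((n_a, m))     # candidate cell state
c_prev = rng.standard_normal((n_a, m))  # previous cell state

# Dot products sum over the inner dimension, so the result has a different shape.
wrong_a = np.dot(ft, c_prev.T) + np.dot(it, cct.T)  # (n_a, n_a)
wrong_b = np.dot(ft.T, c_prev) + np.dot(it.T, cct)  # (m, m)

# Element-wise products keep the shape of the operands.
c_next = ft * c_prev + it * cct                     # (n_a, m)

print(wrong_a.shape, wrong_b.shape, c_next.shape)
```

Only the element-wise version yields the (n_a, m) shape that the assertion on cache[1] expects.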

In addition to Tom’s points about the shapes, it’s worth noting that Prof Ng is very consistent in his notation:

For “elementwise” products, he always uses the explicit operator "*" in the math formula, and he uses it only for that purpose.

For dot products, he always writes the two operands adjacent with no explicit operator.

It’s been that way since day 1 of DLS C1. Here’s a thread with a more complete discussion of the general point and also a discussion of how “broadcasting” fits in.
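As a rough illustration of how that notation maps onto NumPy, here is a small sketch. The names Wf, bf, a_prev and xt and the sizes are assumptions chosen for the example, not code from the assignment.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical sizes and parameters, chosen only for illustration.
n_x, n_a, m = 3, 5, 10
rng = np.random.default_rng(1)
Wf = rng.standard_normal((n_a, n_a + n_x))  # weight matrix
bf = rng.standard_normal((n_a, 1))          # bias, one column
a_prev = rng.standard_normal((n_a, m))
xt = rng.standard_normal((n_x, m))
c_prev = rng.standard_normal((n_a, m))

concat = np.concatenate((a_prev, xt), axis=0)  # (n_a + n_x, m)

# Operands written adjacent in the formula -> dot product (np.dot or @).
z = np.dot(Wf, concat)                          # (n_a, m)

# Adding bf of shape (n_a, 1) relies on broadcasting across the m columns.
ft = sigmoid(z + bf)                            # (n_a, m)

# An explicit "*" in the formula -> element-wise product (* or np.multiply).
scaled = ft * c_prev                            # (n_a, m)

print(z.shape, ft.shape, scaled.shape)
```

In NumPy, `*` and `np.multiply` are the same element-wise operation, while `np.dot` and `@` perform the matrix product for 2-D arrays.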
