W1_A1_E3_lstm_cell_forward

Hi everyone, I am having some problems with this exercise, and I am not sure the tests are implemented correctly (from a matrix-size point of view). I am attaching the function, the tests, and the error (the error itself makes sense to me):

Function

{moderator edit - solution code removed}

Test code

np.random.seed(1)
xt_tmp = np.random.randn(3, 10)
a_prev_tmp = np.random.randn(5, 10)
c_prev_tmp = np.random.randn(5, 10)
parameters_tmp = {}
parameters_tmp['Wf'] = np.random.randn(5, 5 + 3)
parameters_tmp['bf'] = np.random.randn(5, 1)
parameters_tmp['Wi'] = np.random.randn(5, 5 + 3)
parameters_tmp['bi'] = np.random.randn(5, 1)
parameters_tmp['Wo'] = np.random.randn(5, 5 + 3)
parameters_tmp['bo'] = np.random.randn(5, 1)
parameters_tmp['Wc'] = np.random.randn(5, 5 + 3)
parameters_tmp['bc'] = np.random.randn(5, 1)
parameters_tmp['Wy'] = np.random.randn(2, 5)
parameters_tmp['by'] = np.random.randn(2, 1)

a_next_tmp, c_next_tmp, yt_tmp, cache_tmp = lstm_cell_forward(xt_tmp, a_prev_tmp, c_prev_tmp, parameters_tmp)

print("a_next[4] = \n", a_next_tmp[4])
print("a_next.shape = ", a_next_tmp.shape)
print("c_next[2] = \n", c_next_tmp[2])
print("c_next.shape = ", c_next_tmp.shape)
print("yt[1] =", yt_tmp[1])
print("yt.shape = ", yt_tmp.shape)
print("cache[1][3] =\n", cache_tmp[1][3])
print("len(cache) = ", len(cache_tmp))

UNIT TEST

lstm_cell_forward_test(lstm_cell_forward)

Error


ValueError Traceback (most recent call last)
in
15 parameters_tmp['by'] = np.random.randn(2, 1)
16
---> 17 a_next_tmp, c_next_tmp, yt_tmp, cache_tmp = lstm_cell_forward(xt_tmp, a_prev_tmp, c_prev_tmp, parameters_tmp)
18
19 print("a_next[4] = \n", a_next_tmp[4])

in lstm_cell_forward(xt, a_prev, c_prev, parameters)
57 it = sigmoid(np.dot(Wi, concat) + bi)
58 cct = np.tanh(np.dot(Wc, concat) + bc)
---> 59 c_next = np.dot(it, cct) + np.dot(ft, c_prev)
60 ot = sigmoid(np.dot(Wo, concat) + bo)
61 a_next = np.dot(ot, np.tanh(c_next))

<array_function internals> in dot(*args, **kwargs)

ValueError: shapes (5,10) and (5,10) not aligned: 10 (dim 1) != 5 (dim 0)

Can you please help me understand it? From a linear algebra point of view the error is clear, but I think the issue is in how the test is implemented.

It worked for me. :smile:

Most likely you are misinterpreting the math formulas. Be careful to remember the convention that Prof Ng always uses: if he means “elementwise” multiply, he always and only uses “*” to indicate that. If he writes the two operands adjacent with no explicit operator, then and only then is the operation “real” matrix multiply (dot product style). Here’s a thread which discusses this in more detail.
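To make the distinction concrete, here is a minimal sketch using two arrays with the same (5, 10) shape as in the traceback. The names `gate` and `state` are hypothetical stand-ins, not variables from the notebook.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins: two arrays of shape (n_a, m) = (5, 10)
gate = rng.random((5, 10))
state = rng.standard_normal((5, 10))

# Elementwise multiply: the shapes match, so the result is also (5, 10)
elementwise = gate * state  # equivalent to np.multiply(gate, state)
print(elementwise.shape)

# Matrix multiply: the inner dimensions (10 and 5) do not match, so NumPy raises
try:
    np.dot(gate, state)
    dot_failed = False
except ValueError as err:
    dot_failed = True
    print("np.dot failed:", err)
```

With these shapes, `np.dot` raises the same kind of "shapes not aligned" error as in the traceback, while `*` works entry by entry.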

Here’s the math formula shown in the instructions for the line of code that is “throwing”:
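Assuming the notebook uses the standard LSTM cell-state update (this is a reconstruction, not copied from the instructions), the formula reads as follows, where $\Gamma_f^{\langle t \rangle}$, $\Gamma_i^{\langle t \rangle}$, $\tilde{c}^{\langle t \rangle}$ and $c^{\langle t-1 \rangle}$ would correspond to `ft`, `it`, `cct` and `c_prev` in the traceback:

```latex
c^{\langle t \rangle} = \Gamma_f^{\langle t \rangle} * c^{\langle t-1 \rangle} + \Gamma_i^{\langle t \rangle} * \tilde{c}^{\langle t \rangle}
```

Both products are written with an explicit "*", so by the convention above they are elementwise.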
