The problem here comes from the use of np.dot(), which performs matrix (inner product) multiplication, a very different operation from elementwise multiplication of two matrices.
The error message explains exactly why it fails: dimension 1 of the first matrix has 10 elements, while dimension 0 of the second matrix has only 5. For matrix multiplication, dimension 1 of the first matrix must equal dimension 0 of the second.
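As a quick sketch (with hypothetical shapes matching the numbers in the error message), np.dot requires the inner dimensions to agree, while * requires the full shapes to match:

```python
import numpy as np

# For np.dot(A, B), A.shape[1] must equal B.shape[0].
A = np.ones((7, 10))   # dimension 1 has 10 elements
B = np.ones((5, 8))    # dimension 0 has only 5 elements
try:
    np.dot(A, B)
except ValueError as e:
    print("not aligned:", e)

# Elementwise multiplication (*) instead needs identical shapes
C = np.ones((7, 8))
D = np.ones((7, 8))
print((C * D).shape)   # (7, 8)
```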
Transposing the result of tanh applied to c_next makes the dimensions align, so the error goes away. HOWEVER, it does not actually solve the problem, because the cell then fails one of the asserts. I changed
a_next = np.dot(ot, np.tanh(c_next))
to
a_next = np.dot(ot, np.tanh(c_next).T)
New Error
---------------------------------------------------------------------------
AssertionError Traceback (most recent call last)
<ipython-input-18-7dee818208b9> in <module>
27
28 # UNIT TEST
---> 29 lstm_cell_forward_test(lstm_cell_forward)
~/work/W1A1/public_tests.py in lstm_cell_forward_test(target)
111 assert cache[1].shape == (n_a, m), f"Wrong shape for cache[1](c_next). {cache[1].shape} != {(n_a, m)}"
112 assert cache[7].shape == (n_a, m), f"Wrong shape for cache[7](ot). {cache[7].shape} != {(n_a, m)}"
--> 113 assert cache[0].shape == (n_a, m), f"Wrong shape for cache[0](a_next). {cache[0].shape} != {(n_a, m)}"
114 assert cache[8].shape == (n_x, m), f"Wrong shape for cache[8](xt). {cache[8].shape} != {(n_x, m)}"
115 assert cache[2].shape == (n_a, m), f"Wrong shape for cache[2](a_prev). {cache[2].shape} != {(n_a, m)}"
AssertionError: Wrong shape for cache[0](a_next). (7, 7) != (7, 8)
The result of matrix (inner product) multiplication is different from that of elementwise multiplication. Why do you use np.dot() here? Why not just use elementwise multiplication?
The key thing to realize is the notational convention that Prof Ng has consistently used throughout all 5 of these courses:
When he means “elementwise” multiplication, he always and only uses * as the operator.
When he means “dot product” style matrix multiplication, he just writes the two operands adjacent to one another with no explicit operator. It’s been this way consistently since the very beginning of Course 1.
With that in mind, look at the mathematical expressions given in the instructions for this section. It’s all right there for you to see.
There are quite a few dot products, which is why they give the hint about np.dot, although you'd think by Course 5 such hints would be considered almost insulting. But there are also a couple of instances of elementwise multiply in the formulas for c^{<t>} and a^{<t>}. So it's the same story as always: you need to know what it is you are trying to say mathematically. Only when you clearly understand that can you write the Python code to "make it so".
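To make that concrete, here is a minimal sketch (assuming n_a = 7 hidden units and m = 8 examples, the shapes from the failing assert above; ot and c_next are random placeholders, not the assignment's actual values) contrasting the two operators:

```python
import numpy as np

n_a, m = 7, 8
ot = np.random.rand(n_a, m)       # output gate activation, shape (n_a, m)
c_next = np.random.rand(n_a, m)   # next cell state, shape (n_a, m)

# a^{<t>} = Gamma_o * tanh(c^{<t>}) is elementwise, so use *
a_next = ot * np.tanh(c_next)
print(a_next.shape)               # (7, 8) -- the shape the unit test expects

# np.dot with a transpose "fixes" the alignment but collapses the m axis,
# producing exactly the (7, 7) shape the assert rejected
wrong = np.dot(ot, np.tanh(c_next).T)
print(wrong.shape)                # (7, 7)
```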
I appreciate your advice! I completed courses 1 and 2 quite a while ago, and then straight to course 5, that’s why the confusion. Thanks for your help on the basics!