Hi,

I have successfully completed the first programming assignment from Week 3, but I'm trying to fully understand the concepts before moving on.

The main thing I'm not sure I understood correctly is the "return_sequences" and "return_state" parameters of the LSTM layers.

My current understanding is:

- We set "return_sequences" to True in the pre-attention LSTM so that we get the hidden state a_t' at every time step, which we need to calculate all the attention weights (alpha).
- In the post-attention LSTM, we set "return_state" to True so that we can get the cell state, and we leave "return_sequences" as False because we only need one output per step to calculate one y_pred at a time, instead of calculating them all together.

Is my understanding correct?

And why do we need the cell state if we never use it in the model?
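For reference, here is a small standalone sketch I used to check what these two parameters actually return (toy shapes, not the assignment's actual model or layer sizes):

```python
import numpy as np
import tensorflow as tf

# Toy input: batch of 2 sequences, 5 time steps, 8 features per step.
x = np.random.rand(2, 5, 8).astype("float32")

# Pre-attention LSTM: return_sequences=True returns the hidden state
# at EVERY time step -> shape (batch, T, units).
a = tf.keras.layers.LSTM(16, return_sequences=True)(x)
print(a.shape)  # (2, 5, 16)

# Post-attention LSTM: return_state=True additionally returns the final
# hidden state h and cell state c. With return_sequences=False (the
# default), the output is only the LAST hidden state -> shape (batch, units).
out, h, c = tf.keras.layers.LSTM(16, return_state=True)(x)
print(out.shape, h.shape, c.shape)  # (2, 16) (2, 16) (2, 16)
```

So with return_sequences=True we get one vector per time step (all the a_t'), while with return_state=True we also get h and c back as separate tensors that can be passed into the layer's initial_state on the next call.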