Hi,
I have successfully completed the first programming assignment from week 3, but I’m trying to fully understand the concepts before moving on.
The main thing I’m not sure I understood correctly is the parameters “return_sequences” and “return_state” of the LSTM layers.
My current understanding is:
- We set “return_sequences” to True in the pre-attention LSTM so that it outputs the hidden state a^&lt;t’&gt; for every timestep t’, which we need in order to calculate all the attention weights (the alphas)
- In the post-attention LSTM, we set “return_state” to True so that we can get the cell state back, and we leave “return_sequences” as False because each step only needs one output to calculate one y_pred at a time, instead of producing them all together (see the sketch below)
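To make this concrete, here is a minimal Keras sketch of how I currently picture the model fitting together; the layer sizes, variable names, and the attention helper are made up for illustration, not the assignment’s actual code:

```python
from tensorflow.keras.layers import (Bidirectional, Concatenate, Dense, Dot,
                                     Input, LSTM, RepeatVector, Softmax)
from tensorflow.keras.models import Model

# All of these sizes are made up for the sketch.
Tx, Ty = 30, 10           # input / output sequence lengths
n_a, n_s = 32, 64         # pre- and post-attention LSTM hidden sizes
vocab_size = 11           # output vocabulary size

# Pre-attention Bi-LSTM: return_sequences=True so we get a^<t'> for EVERY
# timestep t' -- the attention mechanism needs all of them for the alphas.
X = Input(shape=(Tx, vocab_size))
a = Bidirectional(LSTM(n_a, return_sequences=True))(X)   # (batch, Tx, 2*n_a)

# Shared layers for one attention step (names are just illustrative).
repeator = RepeatVector(Tx)
concatenator = Concatenate(axis=-1)
densor1 = Dense(10, activation="tanh")
densor2 = Dense(1, activation="relu")
activator = Softmax(axis=1)   # softmax over the Tx axis so the alphas sum to 1
dotor = Dot(axes=1)

def one_step_attention(a, s_prev):
    """Build one context vector from all hidden states a and the previous s."""
    s_prev = repeator(s_prev)              # (batch, Tx, n_s)
    concat = concatenator([a, s_prev])     # (batch, Tx, 2*n_a + n_s)
    e = densor1(concat)
    energies = densor2(e)                  # (batch, Tx, 1)
    alphas = activator(energies)           # attention weights over all t'
    context = dotor([alphas, a])           # (batch, 1, 2*n_a)
    return context

# Post-attention LSTM: return_state=True so each call hands back the hidden
# state s and cell state c; return_sequences stays False because each call
# consumes a single timestep (one context vector) and emits one output.
post_lstm = LSTM(n_s, return_state=True)
output_layer = Dense(vocab_size, activation="softmax")

s0 = Input(shape=(n_s,))
c0 = Input(shape=(n_s,))
s, c = s0, c0
outputs = []
for t in range(Ty):
    context = one_step_attention(a, s)
    # The returned s and c are threaded back in as the next initial_state.
    s, _, c = post_lstm(context, initial_state=[s, c])
    outputs.append(output_layer(s))        # one y_pred per loop iteration

model = Model(inputs=[X, s0, c0], outputs=outputs)
model.summary()
```

The loop calls the same post_lstm layer Ty times, so its weights are shared across all output timesteps.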
Is my understanding correct?
And why do we need the cell state if we never use it in the model?