Why are we allowed to choose the number of units of an LSTM layer?

Jaime_Gonzalez · March 18, 2022, 11:49am

In this video (https://www.coursera.org/learn/tensorflow-sequences-time-series-and-prediction/lecture/8u3wq/coding-lstms) the instructor decides to use 32 units in his LSTM layer.

But in a previous video (https://www.coursera.org/learn/tensorflow-sequences-time-series-and-prediction/lecture/fP3ND/shape-of-the-inputs-to-the-rnn) he shows this:

[image]

From this image we can infer that the number of LSTM units should be the same as the length of the input sequence (so if your input sequence is composed of 4 numbers, say [1, 5, 6, 8], then your LSTM layer should have x0, x1, x2, x3, i.e. 4 LSTM units).

Note that I am assuming each one of these is what is referred to as one LSTM unit, please correct me if I’m wrong:

So in the first video mentioned, if the input length is 20 (the window size is 20), why would he use 32 LSTM units and not 20? What would be the inputs to the remaining 12 units after x0 - x19 have been fed in?

Chris.X · March 21, 2022, 1:45pm

I think it is up to the developer to decide how many units to use. A common habit is to pick a power of two (2**n). The number should not be too large, or the model can easily overfit, and not too small, or it may underfit.
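To make the size trade-off concrete, here is a rough sketch of how the number of trainable weights grows with the number of units. It assumes the standard LSTM formulation (four gates, each with input weights, recurrent weights and one bias vector); the helper name `lstm_param_count` is made up for this illustration. Note that the window length does not appear anywhere in the formula.

```python
def lstm_param_count(units, input_dim):
    """Trainable weights in one standard LSTM layer (illustrative helper).

    Each of the 4 gates has:
      - input weights:     input_dim * units
      - recurrent weights: units * units
      - a bias vector:     units
    """
    return 4 * (input_dim * units + units * units + units)


# A univariate time series has input_dim = 1 (one number per time step).
for units in (8, 32, 128):
    print(units, lstm_param_count(units, input_dim=1))
# With units=32 and input_dim=1 this gives 4352 weights.
```

The count grows roughly with the square of the number of units, which is why a very large value makes overfitting easier.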

The final prediction at each time step (each cell in the diagram) is still produced by a following Dense layer with its activation, such as sigmoid or softmax.

Hope this helps.

Jaime_Gonzalez · March 21, 2022, 3:56pm

My question is: why is the number of units not exactly equal to the number of timesteps? According to the image, that is what it should always be.

Chris.X · March 21, 2022, 4:48pm

In my opinion, there is a confusion of concepts here.

1. units in an LSTM, or any RNN layer, refers to the dimensionality of the layer's output. This number is chosen by the developer. I have already described what to watch out for, so I won't repeat it here.
2. I guess the units you are referring to are the number of times the LSTM layer is applied in sequence. Using your example: with an input batch [[1, 5, 6, 8]], the units you are referring to would be 4, i.e. the same layer is run 4 times, once per time step.

If my guess in the second point is wrong, then go back to the first point: the definition of units has nothing to do with the number of time steps.
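Here is a minimal NumPy sketch of the distinction, not the actual Keras implementation. The function `lstm_forward` and its random weights are made up for illustration. The same weights are reused at every time step, so the sequence length only controls how many times the loop runs, while units controls the size of the state vector that comes out.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def lstm_forward(x, units, seed=0):
    """Run a toy LSTM over x of shape (timesteps, input_dim).

    Returns the hidden state at every step, shape (timesteps, units).
    Weights are random; this only illustrates shapes, not training.
    """
    rng = np.random.default_rng(seed)
    timesteps, input_dim = x.shape
    # One shared set of weights for the 4 gates, sized by `units`,
    # not by the number of time steps.
    W = rng.normal(scale=0.1, size=(input_dim, 4 * units))
    U = rng.normal(scale=0.1, size=(units, 4 * units))
    b = np.zeros(4 * units)

    h = np.zeros(units)  # hidden state
    c = np.zeros(units)  # cell state
    outputs = []
    for t in range(timesteps):  # the same layer is applied once per step
        z = x[t] @ W + h @ U + b
        i, f, g, o = np.split(z, 4)
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
        outputs.append(h)
    return np.stack(outputs)


# The example sequence [1, 5, 6, 8]: 4 time steps, 1 feature per step.
window = np.array([1.0, 5.0, 6.0, 8.0]).reshape(4, 1)
states = lstm_forward(window, units=32)
print(states.shape)      # (4, 32): 4 time steps, each a 32-dimensional output
print(states[-1].shape)  # (32,): the final output has `units` values
```

In Keras terms, `tf.keras.layers.LSTM(32)` applied to input of shape (batch, 20, 1) returns shape (batch, 32), or (batch, 20, 32) with `return_sequences=True`. There are no "12 remaining units" waiting for input: all 32 units see every one of the 20 time steps.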

Hope this helps.

Jaime_Gonzalez · March 22, 2022, 4:03pm

Thank you @Chris.X, this helped a lot

My idea of what an LSTM unit is was wrong.

I now know that an LSTM unit does not refer to each one of these:

An LSTM unit refers to each one of these circles:

(Taken from: https://www.coursera.org/learn/nlp-sequence-models/lecture/ftkzt/recurrent-neural-network-model)
