Why are we allowed to choose the number of units of an LSTM layer?

In this video (https://www.coursera.org/learn/tensorflow-sequences-time-series-and-prediction/lecture/8u3wq/coding-lstms) the instructor decides to use 32 units in his LSTM layer.

But in a previous video (https://www.coursera.org/learn/tensorflow-sequences-time-series-and-prediction/lecture/fP3ND/shape-of-the-inputs-to-the-rnn) he shows this:

image from which we can infer that the number of LSTM units should be the same as the length of the input sequence (so if your input sequence is composed of 4 numbers, say [1, 5, 6, 8], then your LSTM layer should have x0, x1, x2, x3, i.e. 4 LSTM units).

Note that I am assuming each one of these is what is referred to as one LSTM unit, please correct me if I’m wrong:

So in the first video mentioned, if the input length is 20 (the window size is 20), why would he use 32 LSTM units and not 20? What are the inputs to the 12 remaining units after x0 - x19 have been input?

I think it is up to the developer to decide how many units to use. A common habit is to pick a power of two (2**n). At the same time, it should not be too big, or the model can easily overfit, but also not too small, or it may underfit.
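One way to see that `units` is a free choice is to count an LSTM layer's parameters: the count depends on `units` and the number of input features, never on the window size. A minimal sketch (the helper name `lstm_param_count` is my own, but the formula matches a Keras-style LSTM with four gates, each with an input kernel, a recurrent kernel, and a bias):

```python
# Hypothetical helper: parameter count of a Keras-style LSTM layer.
# Each of the four gates has an input kernel (input_dim x units),
# a recurrent kernel (units x units), and a bias (units).
def lstm_param_count(units: int, input_dim: int) -> int:
    return 4 * (units * input_dim + units * units + units)

# With 1 feature per timestep, a 32-unit layer has the same number of
# parameters whether the window size is 20, 100, or anything else:
print(lstm_param_count(32, 1))  # -> 4352
```

Notice that the window size of 20 never appears in the formula, which is why the instructor is free to pick 32.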

The output of each time step (each cell in the graph) is still passed through a Dense layer with an activation such as sigmoid or softmax (you know what I mean).

Hope this helps :blush:

My question is: why is the number of units not exactly equal to the number of timesteps? According to the image, that is what it should always be.

My opinion is that there is a confusion of concepts here.

  1. units in an LSTM (or RNN) layer refers to the dimensionality of the layer's output. This number is decided by the developer; :point_up_2: I have already described above what to pay attention to, so I won't repeat it here.

  2. I guess the "units" you are referring to are actually the number of timesteps, i.e. the number of times the LSTM cell is called repeatedly.
    Using your example: for an input batch [[1, 5, 6, 8]], that number would be 4, i.e. the same cell is run 4 times in succession.

If my second guess is wrong, then we are back to the first point: the definition of units has nothing to do with the number of timesteps.
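The separation of the two concepts can be sketched with a toy recurrent loop. I use a plain RNN cell here for brevity (an LSTM adds gates but reuses its weights across timesteps in exactly the same way); the weights and names are made up for illustration:

```python
import random

random.seed(0)
UNITS = 3           # dimensionality of the hidden state (chosen by the developer)
seq = [1, 5, 6, 8]  # 4 timesteps -- determines how many times the cell runs

# One shared set of weights, reused at every timestep.
w_x = [random.uniform(-1, 1) for _ in range(UNITS)]
w_h = [[random.uniform(-1, 1) for _ in range(UNITS)] for _ in range(UNITS)]

h = [0.0] * UNITS
for x in seq:  # the same cell is called len(seq) = 4 times
    h = [max(0.0, w_x[i] * x + sum(w_h[i][j] * h[j] for j in range(UNITS)))
         for i in range(UNITS)]

print(len(h))  # -> 3: output size equals UNITS, not the number of timesteps
```

A window of 20 values would simply run the loop 20 times with the same `w_x` and `w_h`; the hidden state would still have `UNITS` entries.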

Hope this helps :blush:

Thank you @Chris.X, this helped a lot

My idea of what an LSTM unit was turned out to be wrong.

I now know that "LSTM units" does not refer to each one of these:

An LSTM unit refers to each one of these circles:

(Taken from: https://www.coursera.org/learn/nlp-sequence-models/lecture/ftkzt/recurrent-neural-network-model)