Difference between the GRULM and LSTM implementations

I wanted to compare the C3_W3 assignment (LSTMs) with the C3_W2 assignment (GRUs). In the GRULM
model, the GRU unit was repeated for 2 units:
[screenshot of the GRULM model definition]

In the LSTM assignment, the LSTM layer had only 1 unit:
[screenshot of the LSTM model definition]

One has 2 units and the other only 1, and I'm curious why the number of units differs. Additionally, shouldn't the number of units be at least as large as the max length of the input sentence? Why do the models above have only 1 and 2 units? I'm basing both the GRU and the LSTM model on a vanilla RNN, which I picture as an unrolled chain where each unit is responsible for one input word.

Hi @Ming_Wei_H

There is a common mistake in understanding what a "layer" is versus what a "unit" is. In GRULM there are 2 "layers" of GRU with 512 "units" each, while in the LSTM model there is 1 "layer" with 50 "units".
The terminology is confusing, and if you want to learn more about it you can read this post; if it's still confusing, don't worry.
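
To make the distinction concrete, here is a minimal Keras sketch. The framework choice is mine for illustration (the assignments may use a different library), and the vocabulary and embedding sizes are made-up placeholders:

```python
import tensorflow as tf

# GRULM-style stack: 2 GRU *layers*, each with 512 *units*.
# vocab size 10000 is a hypothetical placeholder.
grulm_like = tf.keras.Sequential([
    tf.keras.layers.Embedding(input_dim=10000, output_dim=512),
    tf.keras.layers.GRU(512, return_sequences=True),  # layer 1, 512 units
    tf.keras.layers.GRU(512, return_sequences=True),  # layer 2, 512 units
])

# LSTM-style model: 1 LSTM *layer* with 50 *units*.
# embedding dim 64 is likewise a placeholder.
lstm_like = tf.keras.Sequential([
    tf.keras.layers.Embedding(input_dim=10000, output_dim=64),
    tf.keras.layers.LSTM(50),  # a single layer with 50 units
])
```

So "repeating the GRU" in the GRULM model adds a second *layer*; it does not change how many *units* each layer has.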

Another common misconception is that an RNN's number of units depends on sentence length. That is not true. The number of units is the dimensionality of the vector the layer outputs at each time step, not the number of time steps. In the GRULM case, the inputs are 512-dimensional vectors and the outputs are also 512-dimensional vectors (each "unit" produces its own output). So the number of units is just the size of the output you want from the layer; the same layer is applied step by step across the sentence, however long it is.
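
You can see the independence from sentence length in a quick sketch (again assuming Keras): the same 512-unit GRU handles a 5-step and a 200-step input, and only the time dimension of the output changes, never the unit dimension.

```python
import tensorflow as tf

gru = tf.keras.layers.GRU(512, return_sequences=True)

short = tf.random.normal((1, 5, 512))    # batch=1, 5 time steps, 512 features
long = tf.random.normal((1, 200, 512))   # batch=1, 200 time steps, 512 features

print(gru(short).shape)  # (1, 5, 512)   -> last dim is the 512 units
print(gru(long).shape)   # (1, 200, 512) -> same layer, longer sentence
```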

I recently answered a similar question where you can find concrete calculations, which might be informative (or more confusing) 🙂

Cheers