In the C3_W2 assignment, there is an API
tl.GRU(n_units), and we want to repeat this GRU for
n_layers (default 2). It seems that there are 2 GRU units connected in series, but the
max_length of the sentence is 64, which is much bigger than 2. In my understanding, the GRU network takes one token (here, a character) at each GRU unit. Does this mean that the network can only take the first 2 characters as input and ignores the remaining 62 characters? How can all 64 characters fit in this GRU network?
Thanks for the prompt reply. I think I am getting it now. Since all the GRU cells in a layer share the same weights, any number (in this case
max_length) of GRU cells can be connected in series to form a GRU layer. At the end of the day, we are not increasing the trainable parameters, because they all share the same weights and biases. In the vertical direction, multiple GRU layers (a so-called deep GRU) can be stacked (default
n_layers is 2), and that does increase the trainable parameters.
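To make the "same weights at every timestep" idea concrete, here is a minimal NumPy sketch (not the actual Trax implementation) of a GRU layer unrolled over all 64 timesteps, with a second stacked layer on top. The shapes (d_model = 32, n_units = 16) are made up for illustration:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def init_gru_params(d_in, n_units, rng):
    """One weight set per LAYER: input matrix, recurrent matrix, bias per gate."""
    m = lambda r, c: 0.1 * rng.standard_normal((r, c))
    return {g: (m(d_in, n_units), m(n_units, n_units), np.zeros(n_units))
            for g in ("update", "reset", "candidate")}

def gru_step(x, h, p):
    """One timestep; the SAME params p are reused at every step."""
    Wz, Uz, bz = p["update"]
    Wr, Ur, br = p["reset"]
    Wh, Uh, bh = p["candidate"]
    z = sigmoid(x @ Wz + h @ Uz + bz)            # update gate
    r = sigmoid(x @ Wr + h @ Ur + br)            # reset gate
    h_cand = np.tanh(x @ Wh + (r * h) @ Uh + bh) # candidate state
    return (1.0 - z) * h + z * h_cand

def run_gru_layer(xs, p, n_units):
    h = np.zeros(n_units)
    outs = []
    for x in xs:               # one step per character: 64 steps, not 2
        h = gru_step(x, h, p)
        outs.append(h)
    return np.stack(outs)

rng = np.random.default_rng(0)
max_length, d_model, n_units = 64, 32, 16
xs = rng.standard_normal((max_length, d_model))  # 64 embedded characters

layer1 = init_gru_params(d_model, n_units, rng)  # weights of layer 1
layer2 = init_gru_params(n_units, n_units, rng)  # separate weights of layer 2
out = run_gru_layer(run_gru_layer(xs, layer1, n_units), layer2, n_units)
print(out.shape)  # (64, 16): all 64 characters are processed
```

So the "2" is the number of stacked layers (each with its own weight set), while the 64 timesteps all flow through the same cell within a layer.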
Just a little detail. Each GRU layer, including each layer within a stack, has its own weights and biases. Since we apply the same GRU cell across all timesteps of a layer, you can say that the cell's weights are shared across timesteps.
The number of trainable parameters will change as you change the units of the GRU layer, but it will not change based on the number of timesteps of data that is input to the GRU layer.
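You can see this directly from a parameter-count formula. This is a rough sketch assuming the common single-bias GRU formulation (some implementations, e.g. cuDNN-style GRUs, use two bias vectors per gate, so the exact count in a given library may differ):

```python
def gru_param_count(d_in, n_units):
    # 3 gates (update, reset, candidate), each with an input-to-hidden matrix
    # (d_in x n_units), a hidden-to-hidden matrix (n_units x n_units), and a
    # bias vector (n_units) -- single-bias convention assumed here
    return 3 * (d_in * n_units + n_units * n_units + n_units)

# Changing n_units changes the count...
print(gru_param_count(128, 256))  # 295680
print(gru_param_count(128, 512))  # 984576
# ...but max_length never appears in the formula, so feeding 2 or 64
# characters leaves the parameter count unchanged.
```

Note that d_in and n_units are the only inputs; the sequence length is nowhere in the formula.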
The diagram in the link given to you has time along the x-axis and the stacked cells along the y-axis.