Context Length and Exponential Increase in RNN Memory (Parameters)

In general, an RNN is unrolled for an (in principle) unbounded context length, independent of the number of parameters. However, in the lecture video titled "Text generation before Transformer", the instructor says otherwise.
How does the number of parameters in an RNN increase exponentially as the context window size increases? Or did I misunderstand the statement?

Right, how many parameters does one RNN have? Also remember that the RNN has a memory, which might also be bidirectional, so if you increase the context, the memory needs to be longer and more cells of the RNN will contribute to the prediction.
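To make that first question concrete, here is a minimal sketch, assuming PyTorch (which the course does not necessarily use; the sizes below are arbitrary): the parameter count of a single RNN layer is fixed by the input and hidden sizes, and doubles for a bidirectional layer, with no dependence on how many time steps it is later unrolled for.

```python
# Minimal sketch: count the parameters of one RNN layer (PyTorch assumed,
# sizes are made-up illustrations, not values from the course).
import torch.nn as nn

input_size, hidden_size = 128, 256

rnn = nn.RNN(input_size, hidden_size, batch_first=True)      # one direction
birnn = nn.RNN(input_size, hidden_size, batch_first=True,
               bidirectional=True)                            # two directions

def count_params(m: nn.Module) -> int:
    return sum(p.numel() for p in m.parameters())

# Unidirectional: W_ih (H x I) + W_hh (H x H) + two bias vectors of size H,
# i.e. roughly (128 + 256) * 256 + 2 * 256 values.
print(count_params(rnn))
# Bidirectional: about twice that, one set of weights per direction.
print(count_params(birnn))
```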


Thanks, I got your point. I still feel there is a gray area in my understanding. Suppose we have one layer composed of two independent RNNs (M parameters each) to learn the context in both directions. Typically, we get the contextualized representation of words from RNN models by training the model to predict the next/previous token given the history, say, P(t_k | t_1, t_2, ..., t_{k-1}). ELMo is an example.
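Here is a rough sketch of the setup I mean, assuming PyTorch and made-up vocabulary/embedding/hidden sizes (not ELMo's actual configuration): two independent RNNs, each with its own M parameters, read the sequence in opposite directions, and their hidden states are concatenated to give a contextualized representation for each token.

```python
# Sketch of an ELMo-style layer: two independent RNNs, one per direction.
# PyTorch assumed; all sizes are arbitrary illustrations.
import torch
import torch.nn as nn

vocab_size, emb_size, hidden_size = 10_000, 128, 256

embed = nn.Embedding(vocab_size, emb_size)
fwd_rnn = nn.LSTM(emb_size, hidden_size, batch_first=True)  # M parameters
bwd_rnn = nn.LSTM(emb_size, hidden_size, batch_first=True)  # another M parameters

tokens = torch.randint(0, vocab_size, (1, 12))   # a 12-token "sentence"
x = embed(tokens)

h_fwd, _ = fwd_rnn(x)                            # reads left-to-right
h_bwd, _ = bwd_rnn(torch.flip(x, dims=[1]))      # reads right-to-left
h_bwd = torch.flip(h_bwd, dims=[1])              # re-align time steps

# Contextualized representation per token: (1, 12, 2 * hidden_size)
contextual = torch.cat([h_fwd, h_bwd], dim=-1)
print(contextual.shape)
```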

So I can see that the STORAGE memory increases in order to hold the hidden-state vectors of the RNN at each time step (later used for BPTT, attention, concatenation, etc.). However, the parameters of the network remain the same, as they are shared across time steps, right? In the lecture, the term "memory" seems to be loosely used to refer to the "parameters" of a model.

Edit:
Since the instructor used the phrase “compute and memory” requirements grow exponentially, I presume he is referring to the storage memory.
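A small sketch of that point, again assuming PyTorch with arbitrary sizes: the parameter count stays the same regardless of context length, while the number of hidden-state values that must be stored (and the compute to produce them) grows with the number of time steps.

```python
# Sketch: fixed parameter count vs. hidden-state storage that grows with the
# context length T. PyTorch assumed; sizes are arbitrary.
import torch
import torch.nn as nn

rnn = nn.RNN(input_size=128, hidden_size=256, batch_first=True)
n_params = sum(p.numel() for p in rnn.parameters())

for context_len in (16, 32, 64):
    x = torch.randn(1, context_len, 128)   # one sequence of `context_len` steps
    hidden_states, _ = rnn(x)               # one hidden vector per time step
    print(f"T={context_len:3d}  params={n_params}  "
          f"stored hidden values={hidden_states.numel()}")
# params stays constant; the stored hidden values grow linearly with T.
```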