Week 1 - GRU: Why are the hidden state and memory cell always the same?

Hi Team,

I noticed that, in the lecture on GRUs, the memory cell is always set equal to the hidden state. If the memory cell “retains” a characteristic value over the sequence, shouldn’t that result in a constant hidden state over the sequence? Then how would the model predict a different output for each element?

Please specify the lecture / timestamp.

Timestamp: 7:59

The output at each timestep depends not just on the cell state carried across time, but also on the input at that timestep. With this in mind, why would you expect the output to be the same at every timestep?
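For reference, in the notation used in the lectures (I’m writing the standard formulation from memory, so do check it against the slides), the prediction at each timestep is ŷ^{<t>} = g(W_{ya} a^{<t>} + b_y), and a^{<t>} is itself computed from a^{<t-1>} and x^{<t>}. So the current input reaches each prediction through the freshly updated state.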


Yes, the whole point of an RNN is that the “cell state” (also called “hidden state” or “memory” state) changes at each timestep. How it changes based on the inputs (which are both the x^{<t>} and the a^{<t-1>} values) is determined by the parameters (weights) that are learned during training. Of course with the GRU and LSTM architectures, that “cell state” gets more complex, but the high level point is the same: it changes with every timestep.
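If it helps to see that concretely, here is a minimal numpy sketch of one timestep of the simplified GRU from the lecture (the variable names like `Wu` and `Wc` are my own labels, not the course’s starter code, and I’ve left out the full GRU’s relevance gate Γ_r):

```python
import numpy as np

def gru_step(c_prev, x_t, Wu, bu, Wc, bc):
    """One simplified GRU timestep; in a GRU, a^{<t>} = c^{<t>}."""
    concat = np.concatenate([c_prev, x_t])            # [c^{<t-1>}, x^{<t>}]
    gamma_u = 1 / (1 + np.exp(-(Wu @ concat + bu)))   # update gate (sigmoid)
    c_tilde = np.tanh(Wc @ concat + bc)               # candidate new memory
    c_t = gamma_u * c_tilde + (1 - gamma_u) * c_prev  # element-wise blend
    return c_t
```

Note that `c_t` equals `c_prev` only if `gamma_u` comes out as 0 in every element, and `gamma_u` is recomputed from the new `x_t` at every timestep. So in general the state, and hence the output, moves at every step.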

It’s been a couple of years since I listened to the lectures in DLS C5 W1, but I’m sure that Prof Ng discusses this. As with the feed forward nets in C1 and the ConvNets in C4, we can’t say with certainty exactly what the hidden state is doing and how it is doing it, but the idea is that it remembers things like “we’ve seen the subject of the sentence” and “the subject was plural” and uses that to determine later behavior. The behavior is learned from the training data.

There was a lecture in DLS C4 called “What are Deep ConvNets Learning” that describes some really interesting research in which they instrument the neurons in hidden layers of a ConvNet to see how they work and what they “see”, but I don’t know if anyone has done similar research with RNNs.


Thanks Ambresh and Paulin, your responses really help. Correct me if I am wrong: say the memory cell vector stores the information that ‘cat’ is a singular entity in one of its elements (say c[cat]). Now, as the model progresses to new words, the element c[cat] doesn’t change, but the other values in the memory cell change over time and help generate new outputs.
Is my inference correct?

I don’t think we can make the kind of literal interpretation you are proposing, but something like what you suggest may well happen. E.g. the fact that the subject of the sentence has been seen, which word in the sequence it was, and whether it was singular or plural would not, in general, change as you process the later words in the sentence. The point is that we don’t really know whether there is a single “bit” or element of the hidden state that encodes each of the conditions I described. You would need to do the kind of instrumentation research described in the ConvNet lecture I pointed out.

It might well be the case that some elements of the hidden state change only once through the timesteps and then remain constant, but the overall point is that some things very likely change at every timestep. Even there, though, you could imagine a case in which the current timestep is a word that acts like a “stop word” but didn’t get pruned in pre-processing for some reason: a word that normally has a semantic effect, but in this particular context is a semantic NOP.
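To put some purely made-up numbers on that (hypothetical gate values, not anything from a trained model): since the update gate is a vector, each element of the state is kept or overwritten independently.

```python
import numpy as np

c_prev  = np.array([0.9, -0.3, 0.5])    # state after the previous word
c_tilde = np.array([0.1, 0.8, -0.6])    # candidate state for this word
gamma_u = np.array([0.01, 0.95, 0.5])   # hypothetical per-element gates

c_t = gamma_u * c_tilde + (1 - gamma_u) * c_prev
print(c_t)  # [ 0.892  0.745 -0.05 ]
```

Here the first element is carried forward nearly unchanged (your c[cat] kind of feature), the second is mostly overwritten, and a gate near zero across the board would be the “semantic NOP” case.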


Thanks Paulin, got it.