In the GRU lecture, we didn't get a clear picture of the statement below. Can someone please help us understand it?
You can choose to keep some bits constant while updating other bits. For example, maybe you’ll use one-bit to remember the singular or plural cat, and maybe you’ll use some other bits to realize that you’re talking about food. Because we talked about eating and talk about foods, then you’d expect to talk about whether the cat is full later. You can use different bits and change only a subset of the bits at every point in time
Doubt 1: What does it mean to change only a subset of the bits at every point in time?
Doubt 2: We are updating the hidden state values of the memory cell. If so, what is the relation between changing bits and updating the values in each dimension of the activation?
Maybe the important high-level point is that Prof Ng is describing how the hidden state can be used to remember different aspects of what is going on. When he says “you can choose”, what he means is “back propagation can learn …”. We don’t manually design the way the state bits are used by the algorithm: we just have to guess roughly how many bits are required based on the complexity we need to handle, and then the algorithm learns through training.
- In a sequence model, the state potentially changes at every timestep, based on the combination of the state passed in from the previous timestep and what actually happens at the current timestep, e.g. which word we see at that point in the sequence. The point is that what happens at each timestep may change only some of the state bits, not necessarily all of them.
- Changing the hidden state values does not change the shape of anything. Any given element can change its value (in the simplified picture, a bit flipping from 0 to 1 or vice versa), but the number of elements does not change.
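To make the two points above concrete, here is a minimal NumPy sketch of the simplified GRU state update from the lecture, c<t> = Γu * c̃<t> + (1 − Γu) * c<t−1>. The state size of 4, the candidate values, and the gate values are all made up for illustration, and the function name `gated_update` is hypothetical. In a trained GRU the gate is the output of a sigmoid computed from learned weights; here it is set by hand so the effect is easy to see.

```python
import numpy as np

def gated_update(c_prev, c_candidate, gate_u):
    """Simplified GRU update: gate_u * c_candidate + (1 - gate_u) * c_prev, element-wise."""
    return gate_u * c_candidate + (1.0 - gate_u) * c_prev

# Hypothetical 4-element hidden state; element 0 stands in for the
# "singular vs. plural" bit from the lecture example.
c_prev = np.array([1.0, 0.0, 0.0, 0.0])

# Candidate values proposed at the current timestep (made-up numbers).
c_candidate = np.array([-0.7, 0.9, 0.3, -0.2])

# Hand-set update gate (an assumption for illustration):
# 0 keeps the old value of that element, 1 overwrites it with the candidate.
gate_u = np.array([0.0, 1.0, 0.0, 1.0])

c_next = gated_update(c_prev, c_candidate, gate_u)

print("previous state:", c_prev)
print("next state:    ", c_next)
print("same shape:    ", c_next.shape == c_prev.shape)
```

Elements 0 and 2 keep their old values because their gate entries are 0, while elements 1 and 3 take the candidate values. That is what "change only a subset of the bits" looks like in this sketch, and the state vector has the same shape before and after the update.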
Thank you, sir. Below is my intuition:
sentence: The cat which already ate food … which was full.
Assume the word "cat" is stored as bit value 1 and the word "food" is stored in different bits. When the model sees the word "food", it should write that information into the memory cell state while keeping the "cat" bit constant (no update).
We keep the "cat" bit constant in order to predict the word "was" later. We write the word "food" into the memory cell state in order to predict whether the cat is full or not after it ate.
Does my intuition make sense, sir? Also, does "bit" here mean the word vector?