Two things are unclear to me after watching the video on the GRU unit:
- When using GRU units, does the whole RNN consist of GRU units? If not, where do GRU units appear?
- If the activation produced by a GRU is the same as its memory cell, i.e. a = c, then it seems the next unit might get no information about the previous word in the sentence, which makes no sense. What am I missing?
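
To make the second question concrete, here is a minimal sketch of one step of the simplified GRU as I understand it (no relevance gate, and a scalar memory cell to keep it short). The function names and the weight names `w_cc`, `w_cx`, `w_uc`, `w_ux`, `b_c`, `b_u` are my own assumptions for illustration, not notation from the video.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def gru_step(c_prev, x, w_cc, w_cx, b_c, w_uc, w_ux, b_u):
    """One simplified GRU step with a scalar memory cell (hypothetical names)."""
    # Candidate memory, computed from the previous memory and the current input.
    c_tilde = math.tanh(w_cc * c_prev + w_cx * x + b_c)
    # Update gate, a value strictly between 0 and 1.
    gamma_u = sigmoid(w_uc * c_prev + w_ux * x + b_u)
    # New memory is a blend of the candidate and the previous memory.
    c = gamma_u * c_tilde + (1.0 - gamma_u) * c_prev
    # The activation passed to the next step is the memory itself (a = c).
    a = c
    return a, c


def run_gru(xs, params, c0=0.0):
    """Apply the same GRU step to every input in the sequence."""
    c = c0
    activations = []
    for x in xs:
        a, c = gru_step(c, x, **params)
        activations.append(a)
    return activations


# Illustrative parameter values only, chosen by hand.
params = dict(w_cc=0.5, w_cx=1.0, b_c=0.0, w_uc=0.0, w_ux=0.0, b_u=0.0)
print(run_gru([1.0, 0.0], params))
print(run_gru([-1.0, 0.0], params))
```

The two calls differ only in the first input, so comparing the last activation of each run shows whether anything from the earlier step survives in `c` even though a = c at every step.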