Language modelling with an RNN

Hi, I’m now watching the first week’s lectures and have a question about x<t> = y<t-1>. Correct me if I’m wrong, but it seems to mean that the input at each timestep is the output (the word) from the previous timestep.
But if we take the example sentence “Cats average 15 hours of sleep a day”, the input x<3> is stated to be the word “average” (y<2>). Shouldn’t the input be all the words from the beginning of the sentence, that is “Cats average”? After all, y<3> in this example is calculating the probability of each word in the dictionary given that what just came before was “Cats average”, not just the single word “average”.
So - what is the real input?

The context of all the words encountered up to, but not including, the current timestep is carried in the RNN’s hidden state (and, for an LSTM, its cell state as well). That state is passed into timestep t alongside the single-word input x<t>, so the prediction y<t> is still conditioned on the whole prefix “Cats average”, even though the explicit input at that step is only “average”.
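To make that concrete, here is a minimal numpy sketch of one forward step of a vanilla RNN cell. This is my own simplified version under the usual vanilla-RNN equations, not the assignment’s exact code or notation; the weight names (Wax, Waa, Wya, ba, by) and the softmax output are assumptions for illustration:

```python
import numpy as np

def rnn_cell_forward(x_t, a_prev, Wax, Waa, Wya, ba, by):
    # a_t mixes the previous hidden state (the summary of every earlier word)
    # with the current single-word input x_t
    a_t = np.tanh(Waa @ a_prev + Wax @ x_t + ba)
    # y_t is a softmax over the vocabulary: P(next word | everything so far)
    z = Wya @ a_t + by
    y_t = np.exp(z - z.max()) / np.exp(z - z.max()).sum()
    return a_t, y_t
```

The key point is that a_prev, not x_t, is what remembers “Cats” by the time the network reads “average”.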

It’d also help to look at this week’s assignment, where you’ll implement both the forward and backward passes yourself.
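In the meantime, here is a rough illustration of the feedback loop x<t> = y<t-1> using the cell above. The vocabulary size, hidden-state size, and word indices are all made up for the example; only the structure of the loop matters:

```python
n_x = n_y = 10        # toy vocabulary size (assumption)
n_a = 8               # toy hidden-state size (assumption)
rng = np.random.default_rng(0)
Wax = rng.standard_normal((n_a, n_x)) * 0.1
Waa = rng.standard_normal((n_a, n_a)) * 0.1
Wya = rng.standard_normal((n_y, n_a)) * 0.1
ba, by = np.zeros((n_a, 1)), np.zeros((n_y, 1))

sentence = [3, 7, 1]           # pretend indices for "Cats", "average", "15"
a_prev = np.zeros((n_a, 1))    # a<0>: no context yet
x_t = np.zeros((n_x, 1))       # x<1> is the zero vector, as in the lecture

for word_idx in sentence:
    a_prev, y_t = rnn_cell_forward(x_t, a_prev, Wax, Waa, Wya, ba, by)
    # the next input is just the one-hot vector of the word that came next,
    # i.e. x<t> = y<t-1>; the longer history lives in a_prev
    x_t = np.zeros((n_x, 1))
    x_t[word_idx] = 1.0
```

At every step only one word enters through x_t, but a_prev already summarizes everything before it, so in that sense the “real input” to the prediction is the whole prefix.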