In the error analysis process described in the lecture (around minute 4:20), we’re supposed to calculate and compare the conditional probabilities P(y|x) for y_hat and for y*.

The calculation should be done with the RNN, but I'm not sure I understand how.

y is not given to the RNN; the RNN has to produce it itself. So how can we calculate the probability of a given output y?
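My tentative guess, in case it helps frame the question: since P(y|x) factors by the chain rule into a product of P(y_t | x, y_1, ..., y_{t-1}), maybe the given y can be fed into the decoder one token at a time (instead of the decoder's own predictions), and the softmax probability assigned to each actual token read off and multiplied together. Below is a minimal sketch of that idea, not the lecture's implementation. The toy decoder, its random weights, and the names `make_toy_decoder` and `sequence_log_prob` are all hypothetical, and the zero vector stands in for whatever the encoder produces from x.

```python
import numpy as np

def softmax(z):
    z = z - np.max(z)          # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def make_toy_decoder(vocab_size, hidden_size, seed=0):
    """Hypothetical stand-in for a trained RNN decoder (random weights)."""
    rng = np.random.default_rng(seed)
    W_emb = rng.normal(size=(vocab_size, hidden_size))
    W_h = rng.normal(size=(hidden_size, hidden_size)) * 0.5
    W_out = rng.normal(size=(hidden_size, vocab_size))

    def step(prev_token, state):
        # One decoder time step: returns P(next token | history) and new state.
        state = np.tanh(W_emb[prev_token] + state @ W_h)
        return softmax(state @ W_out), state

    return step

def sequence_log_prob(step, init_state, y, start_token=0):
    """log P(y|x): sum of log-probabilities of each given token in y."""
    state, prev, total = init_state, start_token, 0.0
    for token in y:
        probs, state = step(prev, state)
        total += np.log(probs[token])  # probability assigned to the GIVEN token
        prev = token                   # feed the given token, not the argmax
    return total

step = make_toy_decoder(vocab_size=5, hidden_size=4)
encoder_state = np.zeros(4)            # assumption: stand-in for the encoding of x
y_star = [1, 3, 2, 4]                  # made-up "human" output
y_hat = [1, 2, 2, 4]                   # made-up "beam search" output
lp_star = sequence_log_prob(step, encoder_state, y_star)
lp_hat = sequence_log_prob(step, encoder_state, y_hat)
print("log P(y*|x) =", lp_star, " log P(y_hat|x) =", lp_hat)
```

If this reading is right, comparing `lp_star` and `lp_hat` would be the comparison the lecture asks for, done in log space to avoid underflow on long sequences. Is that what is meant?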