- In the lecture, around time index [9:00], the professor says "max over all actions a’ of Q(s’^(1)),
that’s a state you got to in this example …" and then circles s’^(2) in blue.
My question:
- Why does the max over all actions a’ of Q(s’^(1)) correspond to the state s’^(2) in the next example?
Thanks!