Here Andrew said that at a terminal state the value is the same for both actions, i.e. 100 and 100 in state 1, and 40 and 40 in state 6.

But what if my action is to the **right** from state 1 (or to the **left** from state 6)? Won't the equation be:

R(s) + 0.5 * max Q(s', a') = 100 + 0.5 * 50 = 125?
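
To spell out the arithmetic in the line above, here is a minimal sketch of the calculation as I understand it. The variable names are placeholders of my own, and the numbers are the ones from the equation above. It only reproduces my calculation and does not show whether applying the formula at a terminal state is valid.

```python
# Values taken from the equation above; the names are illustrative placeholders.
reward = 100       # R(s) for state 1
gamma = 0.5        # discount factor
max_q_next = 50    # max over a' of Q(s', a'), as used in the equation above

# Q value for moving right from state 1, if the usual update applied there
q_right = reward + gamma * max_q_next
print(q_right)  # 125.0
```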

I am a little confused. Please correct me where I am going wrong.