Bellman Equation

Here Andrew said that at a terminal state the value is the same for both actions, i.e. 100 and 100 in state 1, and 40 and 40 in state 6.

But what if my action is to go right from state 1 (or left from state 6)? Won't the equation be:

R(s) + 0.5 * max(Q(s', a')) = 100 + 0.5 * 50 = 125?

I am a little confused. Please correct me where I am going wrong.

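Hello @waze, the requirement is that no further action will be taken once the robot is at a terminal state. So there is no next state s' to look ahead to, the 0.5 * max(Q(s', a')) term drops out, and Q(s, a) = R(s) for either action: 100 in state 1 and 40 in state 6.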
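
If it helps to see this numerically, here is a rough sketch of the six-state example. It assumes rewards of 100 in state 1, 40 in state 6 and 0 in the states between (the zero rewards are my assumption, they are not stated in this thread), a discount factor of 0.5, and deterministic left/right moves. The names `compute_q`, `REWARDS`, `TERMINAL` and `ACTIONS` are made up for illustration and are not from the course lab code.

```python
GAMMA = 0.5  # discount factor used in the question

# Assumed rewards: only the two terminal states pay out.
REWARDS = {1: 100, 2: 0, 3: 0, 4: 0, 5: 0, 6: 40}
TERMINAL = {1, 6}
ACTIONS = {"left": -1, "right": +1}


def compute_q(n_sweeps=50):
    """Repeatedly apply the Bellman equation until the Q-values settle."""
    q = {(s, a): 0.0 for s in REWARDS for a in ACTIONS}
    for _ in range(n_sweeps):
        new_q = {}
        for s in REWARDS:
            for a, step in ACTIONS.items():
                if s in TERMINAL:
                    # No further action is taken at a terminal state,
                    # so there is no discounted future term: Q(s, a) = R(s).
                    new_q[(s, a)] = float(REWARDS[s])
                else:
                    s_next = s + step
                    best_next = max(q[(s_next, b)] for b in ACTIONS)
                    new_q[(s, a)] = REWARDS[s] + GAMMA * best_next
        q = new_q
    return q


if __name__ == "__main__":
    q = compute_q()
    for s in sorted(REWARDS):
        print(s, q[(s, "left")], q[(s, "right")])
```

Under these assumptions, state 1 comes out as 100 for both actions and state 6 as 40 for both actions. The 50 in your calculation is the value of going left from state 2, which is never reached from state 1 because the episode has already ended there.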


Oh yeah, I forgot that point. Thank you.

No problem @waze! :slight_smile: