Bellman Equation

waze · August 30, 2022, 10:38am

Here Andrew said that at a terminal state the value is the same for both actions, i.e. 100 and 100 in state 1, and 40 and 40 in state 6.

But what if my action is to go right from state 1 (or left from state 6)? Won't the equation be:

R(s) + 0.5 * max Q(s', a') = 100 + 0.5 * 50 = 125?

I am a little confused. Please correct me where I am going wrong.

rmwkwok · August 30, 2022, 1:37pm

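Hello @waze, the requirement is that no further action is taken once the robot is at a terminal state. There is therefore no next state s', so the 0.5 * max Q(s', a') term drops out and Q(s, a) = R(s) for either action: 100 in state 1 and 40 in state 6.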
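
To make this concrete, here is a minimal sketch that computes Q(s, a) for the six-state example. The two terminal rewards (100 and 40) and the discount factor 0.5 come from the question above. Everything else is an assumption for illustration: the zero rewards in states 2 to 5, the deterministic left/right moves, and the names `compute_q`, `next_state`, `REWARDS` and `TERMINAL`.

```python
# Assumed setup: states 1..6 in a row, terminal at both ends.
GAMMA = 0.5
REWARDS = {1: 100, 2: 0, 3: 0, 4: 0, 5: 0, 6: 40}  # rewards for 2..5 are assumed
TERMINAL = {1, 6}
ACTIONS = ("left", "right")


def next_state(s, a):
    """Deterministic move: left decreases the state index, right increases it."""
    return s - 1 if a == "left" else s + 1


def compute_q(n_sweeps=50):
    """Repeatedly apply the Bellman update until the values settle."""
    q = {(s, a): 0.0 for s in REWARDS for a in ACTIONS}
    for _ in range(n_sweeps):
        for s in REWARDS:
            for a in ACTIONS:
                if s in TERMINAL:
                    # No further action is taken, so there is no s' term.
                    q[(s, a)] = float(REWARDS[s])
                else:
                    s2 = next_state(s, a)
                    best_next = max(q[(s2, b)] for b in ACTIONS)
                    q[(s, a)] = REWARDS[s] + GAMMA * best_next
    return q


q = compute_q()
for s in REWARDS:
    print(s, q[(s, "left")], q[(s, "right")])
```

Under these assumptions, the terminal rows come out as 100, 100 and 40, 40. The 125 only appears if the terminal check is skipped and the robot is allowed to keep moving from state 1.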

waze · August 30, 2022, 3:14pm

Oh yeah, I forgot that point. Thank you.

rmwkwok · August 31, 2022, 12:03am

No problem @waze!
