Reward calculation

How can it be 6.25?
It should be 5, right?
Please solve the issue.

Hi @Arijit_Goswami

It follows the path shown in this photo.
With a discount factor gamma of 0.5, Q(3, right) means you take one step to the right first and then behave optimally from there. The optimal behaviour after that first step is to turn around and go left to reach the 100, so Q(3, right) = 6.25, as shown in the photo.
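To check the numbers, here is a minimal sketch of the calculation. It rests on assumptions that are not stated in this thread: six states in a row, a terminal reward of 100 at state 1 and 40 at state 6, and a reward of 0 everywhere else. The names `REWARDS`, `optimal_values`, `q_value` and `always_right_return` are made up for this example. Under those assumptions, going right once and then acting optimally gives 6.25, while going right all the way to the other end gives 5, which may be where the 5 in the question comes from.

```python
GAMMA = 0.5

# Assumed layout (not given in the thread): states 1..6 in a row,
# terminal rewards at both ends, zero reward in between.
REWARDS = {1: 100, 2: 0, 3: 0, 4: 0, 5: 0, 6: 40}
TERMINAL = {1, 6}


def optimal_values(gamma=GAMMA, sweeps=50):
    """Value iteration: return from each state when behaving optimally."""
    v = {s: float(r) for s, r in REWARDS.items()}
    for _ in range(sweeps):
        for s in REWARDS:
            if s in TERMINAL:
                v[s] = float(REWARDS[s])
            else:
                # Reward now, plus discounted value of the better neighbour.
                v[s] = REWARDS[s] + gamma * max(v[s - 1], v[s + 1])
    return v


def q_value(state, action, gamma=GAMMA):
    """Take `action` once from `state`, then behave optimally."""
    v = optimal_values(gamma)
    next_state = state - 1 if action == "left" else state + 1
    return REWARDS[state] + gamma * v[next_state]


def always_right_return(state, gamma=GAMMA):
    """Discounted return when moving right on every step (not optimal)."""
    total, discount = 0.0, 1.0
    while True:
        total += discount * REWARDS[state]
        if state in TERMINAL:
            return total
        state += 1
        discount *= gamma


if __name__ == "__main__":
    print("Q(3, right)        =", q_value(3, "right"))
    print("Q(3, left)         =", q_value(3, "left"))
    print("always right from 3 =", always_right_return(3))
```

The difference is in what happens after the first step: Q(3, right) only fixes the first action, and every later action is chosen to maximise the return.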


