Week 3, State-action value module. Tie breakers for returns!

Hari_Krishnan_94 · June 21, 2025, 2:31am

Hi all,
I was playing with the lab to see how the chosen actions change with the rewards and the discount factor (I chose 0.5). How does one choose the action when there’s a tie in return? In the image below, it goes left. Is that because there’s logic that picks the option with the fewest steps when there’s a tie?
How do we typically deal with tie breakers? Do we add rules based on the physics of the problem we are dealing with?

Thanks,
Hari

conscell · June 21, 2025, 2:39am

Hi @Hari_Krishnan_94,

Many RL libraries and environments loop through the action list in a fixed order and pick the first action with the maximum value, so a tie goes to whichever action comes first in that list. If you want ties to be broken in favor of fewer steps, you can add extra logic such as shortest path or minimal cost-to-go, provided the environment exposes that information.
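To make the "first action with the max value" behavior concrete, here is a minimal sketch in plain Python. The action order `ACTIONS`, the helper names, and the Q-values are all hypothetical and are not taken from the lab's code; the second helper shows one common alternative, breaking ties at random.

```python
import random

# Hypothetical action order; the lab's actual order is an assumption here.
ACTIONS = ["left", "right"]

def greedy_first(q_values):
    """Return the index of the first action whose value equals the maximum."""
    best = max(q_values)
    return q_values.index(best)  # list.index returns the first match

def greedy_random(q_values, rng, tol=1e-9):
    """Pick uniformly at random among all actions within tol of the maximum."""
    best = max(q_values)
    tied = [i for i, q in enumerate(q_values) if best - q <= tol]
    return rng.choice(tied)

# Hypothetical tied values for Q(s, left) and Q(s, right).
q = [12.5, 12.5]
print(ACTIONS[greedy_first(q)])  # first-max rule picks "left"

rng = random.Random(0)
picks = {ACTIONS[greedy_random(q, rng)] for _ in range(200)}
print(picks)  # random tie-breaking ends up using both actions
```

With the first-max rule, the tie always resolves to whichever action is listed first, which would look like a preference for "left" even though no step-count logic is involved.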

TMosh · June 21, 2025, 6:35am

No.

If it’s a tie, it doesn’t matter which you choose.
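A small worked example of why the choice does not matter: the sketch below uses a hypothetical 5-state chain with a reward of 100 at each end and a discount factor of 0.5. These rewards are an assumption made to produce a tie, not the lab's values, and the return is counted starting from the reward of the current state.

```python
GAMMA = 0.5
# Hypothetical rewards: terminal states at both ends, zero in between.
REWARDS = [100, 0, 0, 0, 100]

def discounted_return(start, step, rewards, gamma):
    """Discounted return from walking in one direction until a terminal end.

    step is -1 for left and +1 for right.
    """
    total, discount, s = 0.0, 1.0, start
    while True:
        total += discount * rewards[s]
        if s == 0 or s == len(rewards) - 1:
            return total
        s += step
        discount *= gamma

# From the middle state both directions reach a reward of 100 in two steps.
print(discounted_return(2, -1, REWARDS, GAMMA))  # 25.0
print(discounted_return(2, +1, REWARDS, GAMMA))  # 25.0
```

Both directions give the same return of 25.0 from the middle state, so either action is equally good under the return criterion.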
