How does R(s) reward an action (e.g. firing engines) which is not part of the state?

rmwkwok · August 18, 2022, 1:10am

Hello Michael @mosofsky,

It helps to remember that the reward is decided solely by the environment; it is not bound by how any particular theory or equation writes it. The Lunar Lander environment is such a case: it chooses to give negative points for firing the engines. This code line computes part of the lunar lander's reward from how the state changes, and the next few lines subtract the penalties for engine firing. In fact, many of the items in the list of rewards depend on actions or on changes of state, not on a single state.
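To make this concrete, here is a minimal sketch of a reward that depends on the transition and the action rather than on the state alone. It is not the environment's actual code: the `State` fields, the action ids, the `shaping` function, and every coefficient are assumptions chosen only for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class State:
    """Hypothetical, simplified lander state (position and velocity only)."""
    x: float
    y: float
    vx: float
    vy: float


# Hypothetical action ids, for illustration only.
NOOP = 0
MAIN_ENGINE = 2

# Illustrative fuel penalty per step of main-engine firing (an assumption).
MAIN_ENGINE_COST = 0.3


def shaping(s: State) -> float:
    """Score a single state: closer to the pad and slower is better."""
    return -100.0 * math.hypot(s.x, s.y) - 100.0 * math.hypot(s.vx, s.vy)


def reward(state: State, action: int, next_state: State) -> float:
    """Reward for one step: change in shaping, minus a cost for the action."""
    r = shaping(next_state) - shaping(state)  # depends on the state change
    if action == MAIN_ENGINE:
        r -= MAIN_ENGINE_COST                 # depends on the action itself
    return r


# The same transition earns a different reward depending on the action taken.
s = State(x=0.0, y=1.0, vx=0.0, vy=-0.5)
s_next = State(x=0.0, y=0.9, vx=0.0, vy=-0.4)
print(reward(s, NOOP, s_next), reward(s, MAIN_ENGINE, s_next))
```

The two printed values differ by exactly the illustrative fuel penalty, even though `state` and `next_state` are identical in both calls. That difference is something a function of the state alone could not express.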

We can’t change the environment, but how we model the reward and how we use it are questions for us to consider. Did you check how the reward is used in the assignment? This is a relevant discussion.
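As a rough sketch of one common way a sampled reward gets used (Q-learning style), the agent does not need a formula for the reward at all: it takes whatever number the environment returned for a step and plugs it into a target value. The function name `td_target`, its parameters, and the default discount factor are assumptions for illustration, not code taken from the assignment.

```python
def td_target(reward, next_q_values, done, gamma=0.99):
    """Target value for one observed step (state, action, reward, next_state).

    reward        -- the number the environment returned for this step
    next_q_values -- estimated Q-values for every action in the next state
    done          -- True if the episode ended on this step
    gamma         -- discount factor (0.99 is an illustrative default)
    """
    if done:
        # No future return after a terminal step.
        return reward
    return reward + gamma * max(next_q_values)


# A step that was penalized for firing an engine still feeds in as a plain number.
print(td_target(reward=-0.3, next_q_values=[1.0, 2.0], done=False, gamma=0.5))
```

Because the reward enters only as an observed number attached to a step, it does not matter to the learner whether the environment computed it from the state, the action, or the state change.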

Raymond
