Where did the reward for the initial state come from?

The notebook output shows

Reward Received: 1.1043263227541047

But the section describing all the rewards doesn’t include one for doing nothing in the initial state. Are there more rewards than are documented in the notebook?

Hello @toontalk ,

Great question. OpenAI's Gym documentation mentions that:

Reward for moving from the top of the screen to the landing pad and coming to rest is about 100-140 points.

So the reward you received is for moving that small amount and getting a little bit closer to the landing pad.
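To make the idea of a per-step reward for incremental progress concrete, here is a minimal sketch of a potential-style shaping reward. It is modeled on my reading of the Lunar Lander source in Gym, but the coefficients, the state layout, and the helper names `shaping` and `step_reward` are assumptions for illustration and should be checked against the version you have installed. The example values are made up and are not meant to reproduce the 1.104 figure from the notebook.

```python
import math


def shaping(x, y, vx, vy, angle, left_leg, right_leg):
    """Score a state: closer, slower, more upright, and legs down is better.

    The coefficients below are assumed, not confirmed by the notebook.
    """
    return (
        -100 * math.sqrt(x * x + y * y)      # distance from the landing pad
        - 100 * math.sqrt(vx * vx + vy * vy)  # speed
        - 100 * abs(angle)                    # tilt
        + 10 * left_leg                       # 1 if the leg touches the ground
        + 10 * right_leg
    )


def step_reward(prev_state, state, main_engine=False, side_engine=False):
    """Reward for one step: the change in shaping, minus assumed fuel costs."""
    reward = shaping(*state) - shaping(*prev_state)
    if main_engine:
        reward -= 0.3   # assumed per-frame cost of firing the main engine
    if side_engine:
        reward -= 0.03  # assumed per-frame cost of firing a side engine
    return reward


# Illustrative states: the lander drops slightly toward the pad at (0, 0)
# while doing nothing, with speed and angle unchanged.
prev_state = (0.0, 1.40, 0.0, -0.50, 0.0, 0, 0)
state = (0.0, 1.39, 0.0, -0.50, 0.0, 0, 0)
print(step_reward(prev_state, state))
```

Under these assumptions, even the "do nothing" action earns a small positive reward whenever gravity brings the lander closer to the pad, which would be consistent with the value you saw.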

Hope this helps,


If the 100-140 point reward is doled out in pieces for incremental progress, then the documentation should be updated to say so and explain how it is calculated. This is important.

Also, it is unclear under what circumstances the reward is one of the values between 100 and 140.

I think this is specific to OpenAI's Gym. I would recommend contacting them directly for the reward implementation details.

Hope this helps,


Good idea. It seems the answer is that it is complicated: