Lunar Lander Reward Function

From the Week 3 video lecture on the lunar lander, could you please clarify the statement below?

The designers of the lunar lander application actually put some thought into exactly what behavior you want and codified it in the reward function. To incentivize more of the behaviors you want and fewer of the behaviors like crashing that you don’t want.

Does it mean that we need to list many positive rewards and only a few negative rewards?

Hello @Anbu,

It has nothing to do with how many positive reward items or how many negative reward items we need. There is no preference for which kind should have more.

It means thinking about what behavior you want to see and what behavior you don’t want to see. For the behaviors you don’t want, you give appropriate negative rewards, and for those you want, you give appropriate positive rewards. That’s it; there is no need to think about which kind has more.
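To make this concrete, here is a minimal Python sketch of a reward function written in that spirit. It is not the actual lunar lander implementation: the `StepOutcome` class, the `step_reward` function, and every numeric value are hypothetical, chosen only to illustrate attaching a signed reward to each behavior.

```python
from dataclasses import dataclass


@dataclass
class StepOutcome:
    """Hypothetical summary of what happened in one time step."""
    crashed: bool = False
    landed_softly: bool = False
    legs_touching: int = 0          # number of legs on the ground (0, 1, or 2)
    fired_main_engine: bool = False
    fired_side_thruster: bool = False


def step_reward(outcome: StepOutcome) -> float:
    """Return a reward for one step; all magnitudes are illustrative assumptions."""
    reward = 0.0
    # Behaviors we do NOT want get negative rewards.
    if outcome.crashed:
        reward -= 100.0
    if outcome.fired_main_engine:
        reward -= 0.3               # small penalty to discourage wasting fuel
    if outcome.fired_side_thruster:
        reward -= 0.03
    # Behaviors we DO want get positive rewards.
    if outcome.landed_softly:
        reward += 100.0
    reward += 10.0 * outcome.legs_touching
    return reward
```

For example, `step_reward(StepOutcome(crashed=True))` is negative, while `step_reward(StepOutcome(landed_softly=True, legs_touching=2))` is positive. This sketch happens to list three negative items and two positive ones, but that count carries no meaning; what matters is that each behavior gets a reward of the appropriate sign and size.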

Cheers,
Raymond