Confusion about some aspects of reinforcement learning

For example, in the lunar lander example, the rewards include landing successfully on the pad, moving toward or away from the landing pad, firing the thrusters, crashing, and so on. My question is: in a given state, how can the lunar lander (or the computer) automatically recognize that the lander has successfully landed, crashed, or done any of the other things mentioned in the rewards, and therefore assign the reward accordingly?

An answer would help me a great deal, especially because I can't seem to find it anywhere and I've been stuck for days :slight_smile:

The lunar lander example is part of the OpenAI Gym, so you might search in that direction and see whether there is better documentation. If you would like to read code, you might check this out and try to understand the mechanism behind it.
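
In short, the reward is not something the agent has to figure out on its own: it is computed inside the environment. Every call to `env.step(action)` advances the physics simulation one frame, and the environment's own hand-written rules (leg contact with the ground, the lander body hitting terrain, position and velocity relative to the pad, fuel spent) determine the reward that gets returned to the agent. Here is a minimal sketch of that interaction loop, assuming the classic Gym API where `step` returns four values (newer Gymnasium releases return five and also require `gym[box2d]` to be installed for this environment):

```python
import gym

# The environment, not the agent, decides the reward. Inside env.step()
# the LunarLander code runs its physics engine, checks its own rules
# (legs touching the pad, the body crashing into terrain, fuel used,
# distance and speed relative to the pad), and returns the reward it computed.
env = gym.make("LunarLander-v2")

observation = env.reset()
total_reward = 0.0
done = False

while not done:
    action = env.action_space.sample()                    # random actions, just to show the loop
    observation, reward, done, info = env.step(action)    # reward comes back from the environment
    total_reward += reward

print("Episode ended with total reward:", total_reward)
env.close()
```

So the agent only ever sees the number that `step` hands back; detecting "landed", "crashed", and so on is the job of the environment's source code, which is why reading the LunarLander implementation is the best way to see exactly how each reward term is triggered.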