For example, in the lunar lander example, rewards are given for landing perfectly, moving towards or away from the landing pad, firing the thrusters, crashing, and so on. My question is: in a given state, how does the lunar lander (or the computer) automatically recognize that the lander has successfully landed, crashed, or done any of the other things mentioned in the rewards, so that it can assign the reward accordingly?
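To make the question concrete, here is a rough sketch of what I imagine has to exist somewhere: a hand-written function that inspects the state and decides which event happened before returning a reward. Everything in it is my own guess for illustration (the `LanderState` fields, the `classify` and `reward` helpers, the speed threshold and the reward values); it is not taken from the actual environment code.

```python
from dataclasses import dataclass


@dataclass
class LanderState:
    """Hypothetical state layout; the real environment may differ."""
    x: float             # horizontal offset from the pad centre
    y: float             # height above the ground
    vx: float            # horizontal velocity
    vy: float            # vertical velocity
    angle: float         # tilt in radians, 0 means upright
    legs_touching: int   # number of legs in contact with the ground (0 to 2)
    body_touching: bool  # True if the hull itself touched the ground


def classify(state: LanderState) -> str:
    """Label a state as 'crashed', 'landed' or 'flying' using guessed rules."""
    if state.body_touching:
        return "crashed"
    # Guessed threshold: treat the lander as at rest below this speed.
    at_rest = abs(state.vx) < 0.05 and abs(state.vy) < 0.05
    if state.legs_touching == 2 and at_rest:
        return "landed"
    return "flying"


def reward(state: LanderState, main_engine_on: bool) -> float:
    """Turn the label into a number; all values are placeholders."""
    outcome = classify(state)
    if outcome == "crashed":
        return -100.0
    if outcome == "landed":
        return 100.0
    r = -abs(state.x)      # farther from the pad gives a lower reward
    if main_engine_on:
        r -= 0.3           # small cost for firing the thruster
    return r


if __name__ == "__main__":
    hovering = LanderState(0.5, 1.0, 0.0, -0.2, 0.0, 0, False)
    print(classify(hovering), reward(hovering, main_engine_on=True))
```

Is this roughly how it works, with the environment's programmer writing checks like these by hand, or does the agent somehow learn to detect these events itself?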
An answer would help me a great deal, because I can’t seem to find one and I’ve been stuck for days.