How to approach real-life problems with reinforcement learning

In week 3 on RL, Andrew mentioned that reinforcement learning works well in simulated environments but can be very hard on, e.g., real robots.
He also showed a video of the autonomous helicopter at Stanford University flying upside down, a maneuver it learned by itself through RL.

From the Lunar Lander example we have seen that RL basically means letting your agent explore the environment over and over again and learning the best behavior from those observations. This makes sense in a simulated environment. But how would you do that with a real helicopter like the one Andrew showed? Certainly, you cannot just crash it 10 times to see what happens and learn from that.

I am not looking for a very detailed answer, but rather for a general approach that helps me understand the specifics, limitations, and solution steps of an RL problem.

Typically, the damage from uncontrolled learning can be mitigated by safety measures such as tethers or catch nets.

Also, it’s very common for reinforcement learning training to happen entirely in a simulated physics environment, rather than in the real world. That’s probably cheaper than risking the expensive flying robot hardware over thousands of training episodes.
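
To make the "train in simulation" idea concrete, here is a minimal sketch, not the course's Lunar Lander code and certainly not how the Stanford helicopter was trained. It assumes a made-up toy world with seven altitudes where altitude 0 is a crash and altitude 6 is the goal, and it uses plain tabular Q-learning. The names `step` and `train_in_simulation`, the reward values, and all hyperparameters are my own assumptions for illustration.

```python
import random

# Hypothetical toy world: altitudes 0..6, where 0 is a crash and 6 is the goal.
N_STATES = 7
CRASH, GOAL = 0, 6
ACTIONS = (-1, +1)  # index 0 = descend, index 1 = climb


def step(state, action_index):
    """Simulated dynamics: return (next_state, reward, done)."""
    next_state = state + ACTIONS[action_index]
    if next_state == CRASH:
        return next_state, -10.0, True   # crashing is heavily penalized
    if next_state == GOAL:
        return next_state, 10.0, True    # reaching the goal is rewarded
    return next_state, -1.0, False       # small cost for every other move


def train_in_simulation(episodes=2000, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning; every crash here costs nothing but compute."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    crashes = 0
    for _ in range(episodes):
        # A simulator can be reset to any non-terminal altitude for free.
        state = rng.randint(CRASH + 1, GOAL - 1)
        done = False
        while not done:
            if rng.random() < epsilon:
                action = rng.randrange(2)                        # explore
            else:
                action = 0 if q[state][0] >= q[state][1] else 1  # exploit
            next_state, reward, done = step(state, action)
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            crashes += next_state == CRASH
            state = next_state
    return q, crashes


q_table, crashes = train_in_simulation()
policy = ["climb" if q[1] > q[0] else "descend" for q in q_table[1:GOAL]]
print("simulated crashes during training:", crashes)
print("greedy policy for altitudes 1-5:", policy)
```

The point of the sketch is the crash counter: the agent does crash while it explores, but only inside the simulator. A policy learned this way would still need careful, safety-limited testing before it goes onto real hardware, since a simulator never matches the real world exactly.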
