The Return in Reinforcement Learning

I don't understand the return in reinforcement learning. I'm a bit confused: is the return the value the Mars rover gets once it comes back to its starting position,
or is it the value the rover accumulates while moving toward the reward, with a discount applied along the way?

The return is a single number that lets you compare one approach to another and decide which is better. It accounts for all the steps it takes to get from the starting position to the terminal state, not just the final reward.

One way to calculate a return would be to just sum up the rewards at each state as you step through. But we want to encourage reaching the terminal state sooner, so each step's reward is multiplied by a discount factor, and rewards collected later count for less.
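
Here is a rough sketch of that calculation in Python, in case seeing it as code helps. The reward sequences and the discount factor of 0.5 are made-up values for illustration, not necessarily the ones from the video, and `discounted_return` is just a name I picked. The reward at step t is weighted by the discount factor raised to the power t, so the first reward counts in full and later ones count for less.

```python
def discounted_return(rewards, gamma):
    """Sum the rewards, weighting the reward at step t by gamma**t."""
    total = 0.0
    for t, reward in enumerate(rewards):
        total += (gamma ** t) * reward
    return total


# Hypothetical reward sequences: both paths end at the same reward of 100,
# but the slow path takes two extra steps to get there.
slow_path = [0, 0, 0, 100]
fast_path = [0, 100]

gamma = 0.5  # assumed discount factor, chosen only for this example

print(discounted_return(slow_path, gamma))  # 12.5
print(discounted_return(fast_path, gamma))  # 50.0
```

Both paths reach the same reward, but the shorter one has the higher return, which is the "get there quicker" effect of the discount.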

With this in mind, try watching the video again. I think it will click for you.

Thanks, that's quite helpful.