C3_W3 Quiz (State-action value function) Question 2

The question reads:

You are controlling a robot that has 3 actions: ← (left), → (right) and STOP. From a given state s, you have computed Q(s, ←) = -10, Q(s, →) = -20, Q(s, STOP) = 0.

What is the optimal action to take in state s?

I see what it’s asking and have no problem with the answer. However, I think the question could specify what “optimal” means here: are we optimizing the reward of a single action, or the total return?

Hello Joe,

Would you think differently after reading this again, noting that Q is defined to cover “after that” too?

Cheers,
Raymond
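
For anyone following along, here is a minimal sketch of what that definition implies for the quiz. Since Q(s, a) is the return from taking action a once in state s and then behaving optimally after that, picking the optimal action amounts to picking the largest Q-value. The dictionary and variable names are made up for illustration, and which arrow goes with -10 and which with -20 is an assumption here; it does not change the result.

```python
# Q-values for state s, taken from the quiz. Which direction gets -10 and
# which gets -20 is an assumption for illustration; the result is the same.
q_values = {"left": -10, "right": -20, "STOP": 0}

# Q(s, a) already includes the return collected after the first action,
# so the optimal action is the one with the largest Q-value.
best_action = max(q_values, key=q_values.get)

print(best_action)  # STOP
```

No separate comparison of one-step reward against total return is needed, because the total return is already folded into each Q-value.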

Thanks for providing the material.

I suppose they’re consistent in using the definition of “optimal” that is actually useful.
