Is there a mistake in the "Learning the state-value function"?

Goran_Hrzenjak · February 7, 2023, 11:41am

Hi,

in the video "Learning the state-value function", at 10:30, there is the following definition of y:

On the left side it says:

y(1) = R(s(1)) + gamma * max Q(s'(1), a')
y(2) = R(s(2)) + gamma * max Q(s'(2), a')

Why doesn't a' have an index?
I would expect:

y(1) = R(s(1)) + gamma * max Q(s'(1), a'(1))
y(2) = R(s(2)) + gamma * max Q(s'(2), a'(2))

Thanks

rmwkwok · February 8, 2023, 3:52am

Hello @Goran_Hrzenjak,

No, it is not a mistake. The a' has to be read together with the other a', the one written under the max:

$\max_{a'} Q(s'^{(1)}, a')$

Together, the whole thing means taking the largest value of $Q(s'^{(1)}, a')$ over all possible actions $a'$. At the time we compute $y^{(1)}$, we are still determining what the best $a'$ is, so $a'$ is not a fixed action tied to example 1. It stands for whichever action maximizes $Q$, which is why it carries no index.
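
To make this concrete, here is a minimal sketch of how the targets could be computed. The Q-values, rewards, the discount factor of 0.5, and the names `q_next`, `rewards`, and `targets` are all made up for illustration; they are not taken from the lecture or the course lab.

```python
import numpy as np

GAMMA = 0.5  # discount factor (illustrative value, an assumption)

# Hypothetical Q-values for the next states s'^(1) and s'^(2).
# Row i holds Q(s'^(i), a') for every candidate action a' (4 actions here).
q_next = np.array([
    [1.0, 4.0, 2.0, 0.0],  # Q(s'^(1), a') for a' = 0..3
    [3.0, 0.5, 0.0, 6.0],  # Q(s'^(2), a') for a' = 0..3
])

# Hypothetical rewards R(s^(1)) and R(s^(2)).
rewards = np.array([0.0, 10.0])

# The max runs over the action axis. a' is only the variable of that max,
# so it is "used up" inside the max and carries no example index.
targets = rewards + GAMMA * q_next.max(axis=1)

# The action that attains the max can differ from example to example,
# but it is a result of the max, not an input to it.
best_actions = q_next.argmax(axis=1)

print(targets)       # y^(1), y^(2)
print(best_actions)  # maximizing a' for each example
```

Each row is scanned over all actions, so the same dummy variable a' is reused for every example, just like the loop variable in a `for` loop.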

Raymond

Goran_Hrzenjak · February 12, 2023, 11:16am

I understand. Thank you very much.

rmwkwok · February 13, 2023, 2:11am

You are welcome, @Goran_Hrzenjak!
