Can't see how these are equal in the application of Bellman's equation in the Learning the state-value function lecture

lkj · February 7, 2025, 7:28pm

Unsupervised Learning, Recommenders, Reinforcement Learning > Week 3 > Learning the state-value function
In the lecture, around time index [9:00], the Prof says "max over all actions a' of Q(s'^{(1)}), that's a state you got to in this example …" and then circles s'^{(2)} in blue.

My question:

Why is the max over all actions a' of Q(s'^{(1)}) tied to the s'^{(2)} state from the next example?

Thanks!

lkj · February 7, 2025, 7:59pm

Assuming he just meant s'^{(1)} in the first experience and s'^{(2)} in the second experience …

rmwkwok · February 8, 2025, 11:17am

Hello, @lkj,

Andrew was going through the first example at that time, so I think he meant to circle s'^{(1)} instead of s'^{(2)}. I will open a ticket for the course team to follow up on this.

Cheers,
Raymond
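
For anyone landing here later, a minimal sketch of why the index has to match: each training target is built from a single experience tuple, so y^{(i)} = R(s^{(i)}) + γ · max over a' of Q(s'^{(i)}, a') uses the s' from that same tuple. The discount factor, the stand-in `q_guess` function, and the tuple values below are all made up for illustration and are not taken from the lecture.

```python
GAMMA = 0.5  # hypothetical discount factor
ACTIONS = ["left", "right"]  # hypothetical action set


def q_guess(state, action):
    """Stand-in for the current Q estimate (arbitrary made-up values)."""
    return 0.1 * state + (1.0 if action == "left" else 0.5)


# Each experience tuple is (s, a, R(s), s').
experiences = [
    (4, "left", 0.0, 3),     # experience 1: s'^(1) = 3
    (1, "right", 100.0, 2),  # experience 2: s'^(2) = 2
]


def bellman_target(experience):
    """y = R(s) + gamma * max over a' of Q(s', a'), with s' from this tuple only."""
    _s, _a, reward, s_next = experience
    best_next = max(q_guess(s_next, a_next) for a_next in ACTIONS)
    return reward + GAMMA * best_next


targets = [bellman_target(e) for e in experiences]
print(targets)
```

Here `targets[0]` depends only on s'^{(1)} and `targets[1]` only on s'^{(2)}; nothing from the second tuple enters the first target, which is consistent with the circle belonging on s'^{(1)}.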
