Problem with the final lab

wgl · February 12, 2023, 5:02pm

Why is this needed? I mean, why should the last cost function be different?
Please help me.

SamReiswig · February 12, 2023, 6:10pm

The cost function comes from the Bellman equation, and it’s being used because we are estimating the optimal action-value function.

This is explained in section 6 - Deep Q-Learning of the lab.

Sam

wgl · February 13, 2023, 1:34am

Thanks, but I am still confused about why, if the episode terminates at step j+1, the target is $y_j = R_j$ and not $y_j = R_j + \gamma \max_{a'} Q(s_{j+1}, a')$.

TMosh · February 14, 2023, 4:12am

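I think all it’s saying is that if the episode ends at step j+1, you have reached the last state and there are no future rewards to collect from it, so you ignore the max-Q term and the target is just the reward.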
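
To make the terminal case concrete, here is a minimal NumPy sketch of how the targets could be computed for a mini-batch. It is not the lab's code: `compute_targets`, its argument names, and the numbers are made up for illustration. The `done` flag is assumed to be 1 when the episode terminates at step j+1 and 0 otherwise, so multiplying by `(1 - done)` removes the future-return term for terminal transitions.

```python
import numpy as np

def compute_targets(rewards, next_q_values, done, gamma=0.995):
    """Return y_j for each transition in a mini-batch (illustrative helper).

    rewards:       shape (batch,), the reward R_j
    next_q_values: shape (batch, num_actions), Q(s_{j+1}, a') for every action a'
    done:          shape (batch,), 1 if the episode terminates at step j+1, else 0
    """
    rewards = np.asarray(rewards, dtype=float)
    next_q_values = np.asarray(next_q_values, dtype=float)
    done = np.asarray(done, dtype=float)

    # Best estimated return obtainable from the next state.
    max_next_q = next_q_values.max(axis=1)

    # For terminal transitions (done == 1) the second term is zeroed out,
    # leaving y_j = R_j. Otherwise y_j = R_j + gamma * max_a' Q(s_{j+1}, a').
    return rewards + gamma * max_next_q * (1.0 - done)

# Two made-up transitions: the first continues, the second ends the episode.
rewards = [1.0, 100.0]
next_q = [[2.0, 5.0], [3.0, 7.0]]
done = [0, 1]

targets = compute_targets(rewards, next_q, done, gamma=0.5)
print(targets)  # first target is 1 + 0.5 * 5 = 3.5, second is just the reward, 100.0
```

Whatever the network predicts for the state after a terminal step is not a real future return, because the episode is over, so it should not be added to the target.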
