If reinforcement learning works on a reward-based system, which is all we wanted to do with the lunar lander, why do we need to map X (current state and action) to Y (rewards)? What exactly would learning the parameters of this mapping give us?
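To make the question concrete, here is a toy sketch of the mapping I mean (my own minimal example, not the course's lunar-lander code, assuming X = (state, action) and Y = R + γ·max Q(s′, a′) as in the lectures):

```python
import numpy as np

# Toy 1-D "lander": states 0..4, state 4 is the goal.
# We learn Q(s, a), the mapping from X = (state, action)
# to Y = expected return, stored here as a simple table.
n_states, n_actions = 5, 2          # actions: 0 = left, 1 = right
gamma, alpha = 0.9, 0.5

def step(s, a):
    """Hypothetical environment: reward 1 only on reaching the goal."""
    s_next = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
    reward = 1.0 if s_next == n_states - 1 else 0.0
    return s_next, reward

Q = np.zeros((n_states, n_actions))
rng = np.random.default_rng(0)
for _ in range(2000):
    s = int(rng.integers(0, n_states - 1))   # sample a non-goal state
    a = int(rng.integers(0, n_actions))      # sample an action
    s_next, r = step(s, a)
    # Target Y = R + gamma * max_a' Q(s', a'); learning the parameters
    # (here, the table entries) makes Q(s, a) approximate this target.
    y = r + gamma * Q[s_next].max()
    Q[s, a] += alpha * (y - Q[s, a])

# The learned mapping is what lets the agent act: in each state,
# pick a = argmax_a Q(s, a).
policy = Q.argmax(axis=1)
print(policy[:-1])
```

The rewards alone only score individual transitions; the learned Q mapping is what tells the agent, in any state, which action leads to the most reward in the long run.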
Hello @Mohd_Farhan_Hassan, thanks for posting your query. Could you please specify the video your question is about? That will help us address your doubt more effectively.