What helps the Neural Network in the Lunar Lander example improve?

Hi Everyone,
I was watching videos about reinforcement learning. Specifically, I was watching the video "Learning the state-value function", where Andrew explains a basic algorithm that can be used to train the lunar lander. He mentioned that it MAY work and still needs improvement, yet I cannot wrap my head around it. My main question: how does the neural network in this algorithm know that it is getting closer to the right answer or farther from it? What helps it improve over time?


Hello @RealOmarKhalil
As explained in the video, the training data itself comes from the Bellman equation. For each experience tuple (s, a, R(s), s'), you compute a target value y = R(s) + γ · max over a' of Q(s', a'), and then train the neural network with ordinary supervised learning so that Q(s, a) moves toward y. Because every target folds in an actually observed reward R(s), each round of retraining pulls the network's estimates a little closer to the true state-action value function, and the improved estimates in turn produce better targets for the next round.
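Here is a minimal sketch of how those targets get built, assuming a NumPy-style setup; the `q_network.predict` API, the experience-tuple layout, and the value of `GAMMA` are illustrative assumptions, not the course's exact code:

```python
import numpy as np

GAMMA = 0.995  # discount factor (assumed value for illustration)

def bellman_targets(q_network, experiences):
    """Turn a batch of experience tuples into supervised-learning targets.

    experiences: list of (state, action, reward, next_state, done) tuples
    q_network:   any model exposing predict(states) -> Q-value array
                 of shape (batch, num_actions) (assumed API)
    """
    states, actions, rewards, next_states, dones = map(np.array, zip(*experiences))

    # Bellman equation: y = R(s) + gamma * max_a' Q(s', a'),
    # except at terminal states, where y = R(s).
    max_next_q = q_network.predict(next_states).max(axis=1)
    targets = rewards + GAMMA * max_next_q * (1.0 - dones)

    # (states, actions) are the inputs and targets are the labels
    # for an ordinary supervised regression step.
    return states, actions, targets
```

The key point: each target y mixes a real observed reward with the network's own current estimate, so as the estimates improve, the targets improve, and training on the improved targets improves the estimates further.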


Hi,

I was also having trouble here, but I think the answer is that the reward function R(s) used in the Bellman equation is deterministic, i.e. you already know exactly how to calculate the reward from the state. If the reward function were unknown, you might indeed need a neural network trained on labeled data just to discover it, but here the reward is given directly by the environment, so the Bellman targets are grounded in real, known reward values. A simplified sketch of such a deterministic reward is below.
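For example, here is a hypothetical, heavily simplified reward function in the spirit of the lunar lander; the state layout and the constants are made up for illustration (Gym's actual reward is more involved):

```python
def reward(state, landed, crashed):
    """A fixed formula we can evaluate exactly for any state --
    no learning is required to know R(s)."""
    x, _, vx, vy = state  # horizontal position and velocities (assumed layout)
    r = -0.1 * (abs(x) + abs(vx) + abs(vy))  # shaping: penalize drift and speed
    if crashed:
        r -= 100.0        # large penalty for crashing
    elif landed:
        r += 100.0        # large bonus for a soft landing
    return r
```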

I hope that makes sense… 3 months later 🙂