Need help with implementing RL lab code for a continuous action space

Shivang_Chauhan · September 17, 2022, 4:50am

Hello guys, I successfully applied the current lab's code to the MountainCar Gym project (Mountain Car - Gym Documentation).

The current code assumes that the action space will always be Discrete:
Screenshot 2022-09-17 at 10.17.22 AM

I would like to apply the same code to problems where the action space looks like this:
Screenshot 2022-09-17 at 10.18.09 AM

I changed the current code a bit and managed to make it render the first action, but the compute_loss function is causing an issue because it is written for the Discrete case. Can someone please help me adjust the current code for this project?

https://www.gymlibrary.dev/environments/box2d/bipedal_walker/

Elemento · September 18, 2022, 11:11am

Hey @Shivang_Chauhan,
Welcome to the community. Until now, I have only worked with a finite number of states and a finite number of actions in RL, and this assignment was pretty much the first exercise I solved myself that deals with an infinite state space. However, as far as I understand, you won’t be able to use this algorithm, i.e., Deep Q-Learning (at least the way it is implemented in the assignment), for an environment that has both an infinite state space and an infinite action space.

This follows from the way the neural networks (NN) are created and trained. Given a state, the NN predicts the Q-values for all the possible actions in that state, and during the update, we simply take the maximum across the values predicted for the next state. However, if your environment has an infinite action space, your NN would have to predict the Q-value for the next state for an infinite number of actions, which isn’t possible (at least with the current implementation).
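To make the point above concrete, here is a minimal sketch of the target computation with a finite action set. It is not the assignment's code: `toy_q_values`, `td_target`, and the `GAMMA` value are hypothetical names and numbers chosen only for illustration, and the toy function stands in for a real Q-network.

```python
from typing import List

GAMMA = 0.995  # discount factor; an illustrative value, not taken from the lab


def toy_q_values(state: List[float], num_actions: int) -> List[float]:
    """Hypothetical stand-in for a Q-network: returns one Q-value per discrete action."""
    return [sum(state) * 0.1 + a for a in range(num_actions)]


def td_target(reward: float, next_state: List[float], done: bool, num_actions: int) -> float:
    """Return r if the episode ended, otherwise r + gamma * max over a' of Q(s', a')."""
    if done:
        return reward
    next_q = toy_q_values(next_state, num_actions)  # a finite list, so max() is well defined
    return reward + GAMMA * max(next_q)
```

With a continuous action space such as BipedalWalker's (four values, each in [-1, 1]), there is no finite list of Q-values to take `max` over, so this step has no direct equivalent.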

However, there indeed are ways to deal with environments having an infinite state space and an infinite action space. Please read more about function approximation and the various algorithms that come under the umbrella of function approximation. I hope this helps.
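One simple workaround, offered here as an assumption and not as something the course or this reply prescribes, is to discretize the continuous action range into a finite grid so that a Discrete-style network can still be used. The sketch below builds such a grid; `discretize_actions` is a hypothetical helper, and the 4 dimensions in [-1, 1] are assumed to match BipedalWalker's action space.

```python
from itertools import product
from typing import List, Tuple


def discretize_actions(low: float, high: float, dims: int, bins: int) -> List[Tuple[float, ...]]:
    """Build a finite grid of actions covering [low, high] in each dimension."""
    if bins < 2:
        raise ValueError("bins must be at least 2")
    step = (high - low) / (bins - 1)
    levels = [low + i * step for i in range(bins)]
    # Every combination of levels across the action dimensions is one discrete action.
    return list(product(levels, repeat=dims))


# Assumed to match BipedalWalker: 4 action dimensions, each in [-1, 1].
actions = discretize_actions(-1.0, 1.0, dims=4, bins=3)
print(len(actions))  # 3 ** 4 = 81 discrete actions
```

The Q-network would then have one output per grid entry, and the chosen index would be mapped back to `actions[i]` before stepping the environment. The grid grows as `bins ** dims`, so this only gives coarse control; algorithms designed for continuous actions (for example, actor-critic methods such as DDPG) avoid the grid altogether.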

Cheers,
Elemento
