I have just finished the Machine Learning specialization and was pretty interested in Reinforcement Learning. To get a better grasp of the concepts and try to apply them in practice, I wrote my own version of DQN, but I’m not so sure I got it right, even though it’s “working”.
Your code seems solid, and if it’s working, that’s great! However, you might reconsider some parts of it for better performance (e.g. the epsilon decay rate, better data structures, hyperparameter tuning, etc.).
About working, well… not so well. I’ve made a lot of enhancements, especially around performance. I’m now approaching the training in a vectorized way and handling the Experience Replay without list operations like “pop()”, which gave me A LOT more performance, but I still can’t make it converge to a satisfactory result.
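For reference, by a “pop()-free” replay buffer I mean something roughly along these lines (a simplified sketch of the idea, not my exact code; the capacity, state dimension and field names are placeholders):

```python
import numpy as np

class ReplayBuffer:
    """Preallocated circular buffer; avoids pop()/append on Python lists."""

    def __init__(self, capacity, state_dim):
        self.capacity = capacity
        self.states = np.zeros((capacity, state_dim), dtype=np.float32)
        self.actions = np.zeros(capacity, dtype=np.int64)
        self.rewards = np.zeros(capacity, dtype=np.float32)
        self.next_states = np.zeros((capacity, state_dim), dtype=np.float32)
        self.dones = np.zeros(capacity, dtype=np.float32)
        self.index = 0
        self.size = 0

    def store(self, state, action, reward, next_state, done):
        i = self.index
        self.states[i] = state
        self.actions[i] = action
        self.rewards[i] = reward
        self.next_states[i] = next_state
        self.dones[i] = float(done)
        self.index = (self.index + 1) % self.capacity  # overwrite the oldest entry when full
        self.size = min(self.size + 1, self.capacity)

    def sample(self, batch_size):
        # Vectorized sampling: one indexing operation instead of a Python loop.
        idx = np.random.randint(0, self.size, size=batch_size)
        return (self.states[idx], self.actions[idx], self.rewards[idx],
                self.next_states[idx], self.dones[idx])
```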
Great progress so far! First of all, consider adding current_state = current_state.reshape(1, -1) at the beginning of your compute_action function so a single state gets the batch dimension the network expects.
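Something along these lines (a sketch only; I’m assuming a Keras-style model with predict() and an epsilon-greedy policy, so the model, epsilon and num_actions arguments are placeholders you’d adapt to your setup):

```python
import numpy as np

def compute_action(model, current_state, epsilon, num_actions):
    # A single state of shape (state_dim,) becomes (1, state_dim),
    # i.e. a batch of one, before being fed to the network.
    current_state = np.asarray(current_state).reshape(1, -1)
    if np.random.rand() < epsilon:
        return np.random.randint(num_actions)            # explore: random action
    q_values = model.predict(current_state, verbose=0)   # shape (1, num_actions)
    return int(np.argmax(q_values[0]))                   # exploit: greedy action
```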
For optimizing hyperparameters, pay attention to epsilon and its decay rate. You can experiment with various values to find the best combination. You might also explore alternative formulas or strategies for epsilon decay.
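For example, two common decay schedules look roughly like this (the constants are only starting points to tune for your environment):

```python
def exponential_epsilon(episode, eps_start=1.0, eps_min=0.01, decay_rate=0.995):
    """Multiplicative decay per episode, clamped so some exploration always remains."""
    return max(eps_min, eps_start * decay_rate ** episode)

def linear_epsilon(step, eps_start=1.0, eps_min=0.01, decay_steps=10_000):
    """Linear decay from eps_start to eps_min over a fixed number of steps."""
    fraction = min(1.0, step / decay_steps)
    return eps_start + fraction * (eps_min - eps_start)
```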
That’s all I can think of for now! Let me know if you need further assistance or feedback.
I’ve been tweaking some stuff and managed to get better performance after doubling the number of training episodes. Since then I’ve been trying to change the neural network architecture and play with more episodes to see what happens.
I will try your tip on the next run for sure! Thank you for the answers!!