Help needed: Breakout using DeepQ learning fails to converge in a reasonable timeframe

rmwkwok · September 15, 2025, 2:15am

Restarted the run last night, and the running average reward now seems to be improving at an accelerating rate.

My CPU is giving me about 100,000 timesteps per hour, so running the 10M steps stated in the paper will take about 100 hours, or a little over 4 days. I will see how it goes and decide…
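For anyone who wants to redo the estimate with their own throughput, here is a minimal sketch of the arithmetic. The function name `hours_to_train` is just an illustrative helper (not from the course or the paper), and it assumes the throughput stays constant for the whole run, which may not hold once the replay buffer fills up or the machine is busy with other work.

```python
def hours_to_train(total_steps: int, steps_per_hour: float) -> float:
    """Estimate wall-clock hours, assuming constant throughput (an assumption)."""
    if steps_per_hour <= 0:
        raise ValueError("steps_per_hour must be positive")
    return total_steps / steps_per_hour


# Numbers from the post: 10M steps at roughly 100,000 timesteps per hour.
hours = hours_to_train(10_000_000, 100_000)
days = hours / 24

print(f"{hours:.0f} hours, about {days:.1f} days")
```

Plugging in a measured timesteps-per-hour figure from a short trial run gives a quick sanity check before committing to the full 10M steps.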

Cheers,
Raymond
