I don’t fully understand q_network and target_q_network.
It says we don’t want the y target to change on every iteration. But target_q_network’s weights are updated on every iteration, so the y target does change on every iteration? Should it change or not?
On every iteration, gradient descent is performed on q_network’s weights, and a soft update is performed on target_q_network’s weights. Does this mean q_network’s weights are updated sharply on every iteration, while target_q_network’s weights are updated only softly? Then we use the softly updated target_q_network to calculate the target y, and train the sharply updated q_network against it?
It changes whenever the target Q Network gets updated, so it does change.
Rather than saying “To avoid this”, I think it is more accurate to say “To prevent it from changing too fast”.
Yes, softly updated TQN to get target y, for training the sharply updated QN.
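To make the “soft vs. sharp” distinction concrete, here is a minimal NumPy sketch of a soft update. The function name `soft_update` and the rate `TAU = 0.001` are illustrative assumptions, not the course’s exact code; the idea is that the target network’s weights move only a tiny fraction `TAU` toward the Q-Network’s weights each iteration, so the y target changes slowly rather than not at all.

```python
import numpy as np

TAU = 0.001  # soft-update rate (hypothetical value; the point is that it is small)

def soft_update(q_weights, target_weights, tau=TAU):
    # Blend a small fraction of the Q-Network's weights into the target network:
    # w_target <- tau * w_q + (1 - tau) * w_target
    return [tau * w + (1.0 - tau) * tw for w, tw in zip(q_weights, target_weights)]

# Toy example with one weight matrix per "network"
q_w = [np.ones((2, 2))]        # sharply updated by gradient descent each iteration
target_w = [np.zeros((2, 2))]  # only nudged slightly toward q_w each iteration

target_w = soft_update(q_w, target_w)
print(target_w[0])  # every entry moved only 0.001 of the way from 0.0 toward 1.0
```

With `TAU = 0.001`, the target network lags far behind the Q-Network, which is why the y target it produces is stable enough to train against.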
Thanks for your confirmation!
I think now I get it!
Today I completed the Machine Learning Specialization! I want to thank you @rmwkwok! You replied quickly, answered questions with detailed explanations, and taught me how to sometimes find answers by trying things out myself (e.g., how to read the NumPy docs). Among the mentors, you helped me the most! I am sure many students feel the same way! Thank you for all the support!
Congratulations on your achievement, @Jinyan_Liu! I hope I will see you again here in the future!