Because the target value y changes on every iteration, the concept of a separate network, called the target network, was introduced. My doubts are:
1. Isn't this still a problem? If we copy the main network's weights to the target network every C time steps, the target y does not remain constant; it still changes after every copy.
2. Also, right after the copying step, Qnew = Q(s,a), so won't the MSE always be low on every iteration? That should not be the case, right, sir?
Can someone please explain the target network?
Hello @Anbu ,
Thanks a lot for the question. I will do my best to help you understand what a target network is in my reply.
The target network is a separate neural network that is used to estimate the target values for the Q-learning update rule. It is a copy of the main network, but its parameters are updated less frequently, which helps stabilize the learning process.
Using a single neural network for both estimating the current Q-values and updating the target Q-values can lead to instability in the learning process. This is because the network’s parameters are constantly changing, causing the target values to shift as well. To address this issue, the concept of a target network is introduced.
The target network is a separate neural network that is periodically updated with the parameters of the main Q-network. This means that the target values used for the Q-learning update rule remain more stable, allowing for a more stable learning process.

For example, consider a reinforcement learning problem where an agent is learning to navigate a maze. The agent uses a Q-network to estimate the Q-values for each possible action in its current state. To update the Q-values, the agent also needs to estimate the target Q-values for the next state. Instead of using the same Q-network for this purpose, the agent uses a separate target network, which is updated less frequently. This helps stabilize the learning process and allows the agent to learn more effectively.
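To make the mechanism concrete, here is a minimal sketch of the update loop. This is not the course's actual DQN code: it uses a tiny tabular Q-table in place of real neural networks, and the environment transitions, the copy interval `C`, and all hyperparameter values are illustrative assumptions. The key point it demonstrates is that the target y is computed from the frozen target network, which only changes every `C` steps.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical tiny setting: 4 states, 2 actions. Tabular Q-values
# stand in for the main and target Q-networks.
n_states, n_actions = 4, 2
gamma = 0.95   # discount factor
alpha = 0.1    # learning rate
C = 100        # copy main -> target every C steps

q_main = np.zeros((n_states, n_actions))
q_target = q_main.copy()  # target network starts as a copy of the main one

for step in range(1, 501):
    # Fake experience tuple (s, a, r, s') in place of a real environment.
    s = rng.integers(n_states)
    a = rng.integers(n_actions)
    r = float(rng.random())
    s_next = rng.integers(n_states)

    # Target y = r + gamma * max_a' Q_target(s', a').
    # It is computed from the FROZEN target network, so it stays
    # fixed between copies even though q_main changes every step.
    y = r + gamma * q_target[s_next].max()

    # Gradient-style update on the main network only.
    q_main[s, a] += alpha * (y - q_main[s, a])

    # Every C steps, refresh the target network from the main one.
    if step % C == 0:
        q_target = q_main.copy()
```

Note that between copies, `q_target` is untouched, so the regression targets are stationary; only at every C-th step do they shift, which is much gentler on the optimizer than targets that move on every single update.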
In summary, a target network is a periodically updated copy of the main Q-network used in deep reinforcement learning algorithms. Because its parameters change less frequently, it provides more stable target values for the Q-learning update rule, which stabilizes the learning process.
Please feel free to post a followup question if you feel uncertain about what a target network is.
Dear Mr Can Koz,
Could you please elaborate on this statement using the Lunar Lander example?
I am confused by this statement, especially the keywords "instability" and "constantly changing".