Could someone help me understand this part?

Why is Y a moving target? From my understanding, Y′ is also a moving target, since the weights apply to both of them.

And how is this solved by the Target Q network?

Hi @jjdniz

Both Y and Y′ are moving targets because they depend on the current parameters of the Q-network, which are updated during training. To address this, training uses a target Q-network: a separate network with frozen parameters that estimates Y′, so that estimate stays fixed while the main Q-network is updated.
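
If it helps, here is a minimal sketch of the idea in NumPy. It is only an illustration under some assumptions of mine: a tiny linear Q-function instead of a neural network, the usual TD target r + gamma * max Q(s', a'), and made-up names (`W_online`, `W_target`, `td_target`) and numbers that are not from the course material. It shows that a target computed from the online weights changes after a single gradient step, while one computed from the frozen copy does not.

```python
import numpy as np

gamma, lr = 0.9, 0.1  # hypothetical discount factor and learning rate

# Hypothetical linear Q-function: Q(s, .) = s @ W (rows: features, cols: actions)
W_online = np.array([[1.0, 0.0],
                     [0.0, 0.5]])
W_target = W_online.copy()  # frozen copy used only to compute targets


def q_values(W, s):
    """Q-values of every action in state s under weights W."""
    return s @ W


def td_target(W, r, s_next):
    """TD target r + gamma * max_a' Q(s_next, a') computed with weights W."""
    return r + gamma * np.max(q_values(W, s_next))


# One made-up transition (s, a, r, s_next)
s = np.array([1.0, 0.0])
a = 0
r = 1.0
s_next = np.array([1.0, 1.0])

moving_before = td_target(W_online, r, s_next)  # depends on the online weights
frozen_before = td_target(W_target, r, s_next)  # depends on the frozen weights

# One gradient step on 0.5 * (Q(s, a) - y)^2, treating y as a constant
y = frozen_before
td_error = q_values(W_online, s)[a] - y
W_online[:, a] -= lr * td_error * s

moving_after = td_target(W_online, r, s_next)
frozen_after = td_target(W_target, r, s_next)

print("target from online weights:", moving_before, "->", moving_after)
print("target from frozen weights:", frozen_before, "->", frozen_after)

# Every so often the frozen copy is refreshed from the online weights
W_target = W_online.copy()
synced = td_target(W_target, r, s_next)
```

In this toy setup the frozen target only changes when `W_target` is refreshed, so between refreshes the online network is regressing toward a fixed value rather than chasing its own updates.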