Could someone help me understand this part?
Why is Y a moving target? From my understanding Y’ is also a moving target since the weights apply to both of them.
And how is this solved by the Target Q network?
Hi @jjdniz
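Both Y and Y′ are moving targets because they depend on the current parameters of the Q-network, which change with every training update. The target Q-network addresses this: it is a separate copy of the Q-network whose parameters are kept frozen (or updated only occasionally or slowly), and it is used to estimate Y′. Because those parameters do not change on every gradient step, the value the Q-network is trained toward stays stable between updates.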
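
In case a concrete example helps, here is a minimal sketch of the idea in plain Python. It assumes a tiny linear Q-function and a soft update, which is only one common way to refresh the target network. The names `q_values`, `td_target`, `soft_update`, and the values of `GAMMA` and `TAU` are made up for illustration and are not taken from the course material.

```python
import copy

GAMMA = 0.99  # discount factor (assumed value)
TAU = 0.01    # soft-update rate (assumed value)

def q_values(weights, state):
    """Linear Q-function: one weight row per action."""
    return [sum(w * s for w, s in zip(row, state)) for row in weights]

def td_target(target_weights, reward, next_state, done):
    """Training target computed from the given weights."""
    if done:
        return reward
    return reward + GAMMA * max(q_values(target_weights, next_state))

def soft_update(online, target, tau=TAU):
    """Move the target weights a small step toward the online weights."""
    return [[tau * o + (1 - tau) * t for o, t in zip(o_row, t_row)]
            for o_row, t_row in zip(online, target)]

online = [[0.5, -0.2], [0.1, 0.3]]  # 2 actions, 2 state features
target = copy.deepcopy(online)      # frozen copy used only for targets

next_state = [1.0, 2.0]
before = td_target(target, 1.0, next_state, False)

online[1][1] += 0.5                 # stand-in for one gradient step

after = td_target(target, 1.0, next_state, False)   # frozen copy: unchanged
moving = td_target(online, 1.0, next_state, False)  # online weights: shifted

target = soft_update(online, target)  # target drifts slowly toward online
```

Here `before` and `after` are identical because the frozen copy did not change, while `moving` shows how the target would have shifted if it had been computed from the weights that were just updated.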