Hello @Akilesh_Ramalingam, the soft update to the TQN (Target Q Network) is computed from the QN's weights, so first of all, the QN needs to be trained and retained throughout the learning process. The purpose of the TQN is to keep the QN's training more stable: the QN learns against targets that change slowly rather than against its own rapidly changing estimates. For more relevant discussions and experiments, I strongly suggest spending 10-20 minutes reading through the thread from this point onwards, or from the beginning of the thread.
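To make the soft update concrete, here is a minimal sketch in plain Python. It assumes the common form where each TQN weight is moved a small fraction `tau` towards the matching QN weight. The function name `soft_update`, the flat weight lists and the `tau` values are only illustrative assumptions, not code from the course or from any library.

```python
def soft_update(target_weights, online_weights, tau=0.005):
    """Return new TQN weights blended from the QN (online) weights.

    Assumed rule: w_target <- tau * w_online + (1 - tau) * w_target
    """
    return [
        tau * w_online + (1.0 - tau) * w_target
        for w_target, w_online in zip(target_weights, online_weights)
    ]


# Hypothetical weights, flattened into plain lists for illustration.
qn_weights = [1.0, 2.0]   # the QN, which keeps being trained
tqn_weights = [0.0, 0.0]  # the TQN, which only changes via soft updates

# One soft update with a deliberately large tau so the change is visible.
tqn_weights = soft_update(tqn_weights, qn_weights, tau=0.1)
print(tqn_weights)
```

The update needs the current QN weights at every step, which is why the QN has to be kept and trained for the whole learning process. With a small `tau`, the TQN changes slowly, which is the stabilizing effect.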
Cheers,
Raymond