Why is q_network used instead of target_q_network for inference in C3_W3_A1_Assignment?

The soft update is applied to target_q_network, so why is q_network the one that is saved and used for inference?

Hello @Akilesh_Ramalingam, the soft update to the TQN (Target Q Network) is computed from the QN's weights, so first of all, the QN is the network that has to be trained and retained throughout the learning process. The TQN only follows the QN slowly, and its purpose is to make the training of the QN more stable by giving it slowly changing targets. For more relevant discussions and experiments, I strongly suggest you spend 10-20 minutes reading through the thread from this point onwards, or from the beginning of the thread.
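To make the direction of the soft update concrete, here is a minimal sketch in plain Python. It is not the assignment's code: the flat weight lists, the `soft_update` helper, and the value `TAU = 1e-3` are assumptions for illustration only. It shows that the TQN weights are always derived from the QN weights and lag behind them, while the QN itself is never changed by the soft update.

```python
TAU = 1e-3  # assumed soft-update rate, chosen only for this illustration


def soft_update(q_weights, target_weights, tau=TAU):
    """Return new target weights: tau * q + (1 - tau) * target, per weight."""
    return [tau * q + (1.0 - tau) * t for q, t in zip(q_weights, target_weights)]


# Hypothetical flat weight vectors standing in for the two networks.
q_weights = [1.0, 2.0, 3.0]       # QN: the network that gradient descent trains
target_weights = [0.0, 0.0, 0.0]  # TQN: only ever changed by soft updates

# Hold the QN fixed and apply many soft updates to the TQN.
for _ in range(1000):
    target_weights = soft_update(q_weights, target_weights)

# The TQN has drifted toward the QN but still lags behind it.
print(target_weights)
```

In this sketch the TQN is a smoothed, delayed copy of the QN, which is one way to see why the QN holds the most up-to-date learned values and is the natural one to save for inference.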