C3W3 assignment: what is soft update?

Hi all, I’ve finished the course… While trying to understand the training algorithms for RL, I came across the soft update step, which transfers the weights from the Q-network to the target network after backprop. I’m perplexed by the formula used for this. As the video lecture made no mention of this formula, could someone please advise me? Why is it not just a simple full transfer of weights from the Q-network to the target network, instead of using this funny TAU? Here’s the function that does the soft update in utils.py:
def update_target_network(q_network, target_q_network):
    for target_weights, q_net_weights in zip(target_q_network.weights, q_network.weights):
        target_weights.assign(TAU * q_net_weights + (1.0 - TAU) * target_weights)

TAU is defined as:
TAU = 1e-3 # soft update parameter

I don’t quite understand why it’s not just a full transfer of the weights over to the target network… please advise me… merci beaucoup :smiley:

Hello John @john_lonewolf,

I suggest you read this discussion regarding the purpose of having the target Q-network and, if you have more time, continue further down the thread to the experiment done by Michael that compares the Q-network with the target Q-network.
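
In the meantime, here is a rough sketch of what the formula in your post does numerically. It is not code from the assignment. It applies the same update rule to a single scalar weight in plain Python, and gap_after is a hypothetical helper I made up for illustration. It assumes the Q-network weight stays fixed at 1.0 while the target weight starts at 0.0, which never happens in real training but makes the effect of TAU easy to see.

```python
TAU = 1e-3  # soft update parameter, same value as in the post above


def soft_update(target_w, q_w, tau=TAU):
    """Move the target weight a fraction tau of the way toward the Q-network weight."""
    return tau * q_w + (1.0 - tau) * target_w


def gap_after(n_steps, target_w=0.0, q_w=1.0, tau=TAU):
    """Hypothetical helper: apply n_steps soft updates with q_w held fixed
    (an assumption for illustration) and return the remaining gap q_w - target_w."""
    for _ in range(n_steps):
        target_w = soft_update(target_w, q_w, tau)
    return q_w - target_w


# With tau = 1.0 the rule reduces to the full copy you were asking about.
print(soft_update(0.0, 1.0, tau=1.0))

# With tau = 1e-3 each step closes only 0.1% of the gap,
# so the gap shrinks like (1 - TAU) ** n_steps.
print(gap_after(1), gap_after(1000))
```

So a full transfer is just the special case TAU = 1. With TAU = 1e-3, the target weights are a slowly moving average of the Q-network weights: after 1000 updates in this toy setup about 37% of the original gap is still left. As far as I understand it, the point is that the targets used in training then change gradually rather than jumping after every gradient step, which tends to make training more stable.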

Cheers,
Raymond