Hi all, I’ve finished the course… while trying to understand the training algorithms for RL, I came across the soft update step, which transfers the weights from the q-network to the target network after backprop. I’m puzzled by the formula used for doing this. Since the video lecture made no mention of this formula, could someone please advise me? Why isn’t it just a simple full transfer of weights from the q-network to the target network, instead of using this funny TAU? Here’s the function that does the soft update in utils.py:
def update_target_network(q_network, target_q_network):
    # Nudge each target weight a small step toward the matching q-network weight.
    for target_weights, q_net_weights in zip(target_q_network.weights, q_network.weights):
        target_weights.assign(TAU * q_net_weights + (1.0 - TAU) * target_weights)
TAU is defined as:
TAU = 1e-3 # soft update parameter
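To make sure I’m reading the formula correctly, I wrote it out: each target weight is updated as target = TAU * q + (1 - TAU) * target, so with TAU = 1e-3 the target network only moves 0.1% of the way toward the q-network on each update, while TAU = 1.0 would be the full transfer I was expecting. Here’s a small standalone sketch I put together to convince myself (plain Python with made-up scalar weights, not the course code):

TAU = 1e-3  # soft update parameter, same value as in utils.py

def soft_update(q_weights, target_weights, tau=TAU):
    # Blend each target weight toward the matching q-network weight.
    return [tau * q + (1.0 - tau) * t for q, t in zip(q_weights, target_weights)]

q_net = [1.0, 2.0]   # made-up q-network weights
target = [0.0, 0.0]  # made-up target weights

print(soft_update(q_net, target))           # tiny step toward q: [0.001, 0.002]
print(soft_update(q_net, target, tau=1.0))  # full copy: [1.0, 2.0]

So mechanically I can see what the formula does; it’s the reasoning behind it that escapes me.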
I don’t quite understand why it isn’t just a full transfer of the weights over to the target network… please advise me… thank you very much