Hi, I was wondering what optimizer.apply_gradients(zip(grads, trainable_variables))
does? And what is its relationship to grads = tape.gradient(minibatch_total_loss, trainable_variables)?
Thanks!
Hi @Marcia_Ma,
I will share a brief introduction but leave the rest of the exploration of this topic to you. You will need to experiment with it yourself to fully understand what's going on there.
Recall that in our vanilla gradient descent, we have this weight update formula:

w := w - \alpha \, dw

whereas in RMSProp, we have the following instead:

s_{dw} := \beta s_{dw} + (1 - \beta)(dw)^2, \quad w := w - \alpha \frac{dw}{\sqrt{s_{dw}} + \epsilon}
Note that in both formulations, we always need to compute dw
, or \frac{\partial{J}}{\partial{w}}. dw = tape.gradient(J, w)
does that computation for us.
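Here is a minimal sketch of that step, using a toy loss J(w) = w^2 that I made up just for illustration:

```python
import tensorflow as tf

# Toy example: J(w) = w^2, so dJ/dw = 2w.
w = tf.Variable(3.0)

with tf.GradientTape() as tape:
    J = w ** 2            # the loss is recorded on the tape

dw = tape.gradient(J, w)  # computes dJ/dw = 2 * 3.0 = 6.0
print(dw.numpy())         # 6.0
```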
After computing dw
, we have to plug it into the weight update formula and then update the weight w. optimizer.apply_gradients([(dw, w), ])
is responsible for this process. Note that we keep (dw, w)
in a pair by wrapping them into a tuple so that the program knows dw
is the gradient with respect to w
.
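And here is a small sketch of applying that gradient with an optimizer (I'm assuming plain SGD here just for illustration); the commented lines at the end show how this generalizes to your zip(grads, trainable_variables) call:

```python
import tensorflow as tf

optimizer = tf.keras.optimizers.SGD(learning_rate=0.1)

w = tf.Variable(3.0)
with tf.GradientTape() as tape:
    J = w ** 2
dw = tape.gradient(J, w)

# Each pair is (gradient, variable): dw gets applied to w.
optimizer.apply_gradients([(dw, w)])
print(w.numpy())  # 3.0 - 0.1 * 6.0 = 2.4

# With a model, trainable_variables is a list, so zip pairs each
# gradient with its matching variable:
# grads = tape.gradient(minibatch_total_loss, model.trainable_variables)
# optimizer.apply_gradients(zip(grads, model.trainable_variables))
```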
I recommend reading this doc page for more on auto differentiation, and this one for the use of apply_gradients
. Google for more examples.
Cheers,
Raymond
Thank you so much Raymond, that is really helpful!
You are welcome @Marcia_Ma!