C3W3 Lab RL: why set an Adam optimizer while only applying gradients?

Hello,

I managed to get “C3_W3_A1_Assignment.ipynb” to work locally.

I read the lesson again, along with the code, and I wonder why we set up a specific Adam optimizer when we only apply gradients afterward.
Wouldn't any optimizer work just as well, since we never seem to run any optimization with it anywhere in the code?
Or maybe I missed something hidden in the other files.
Or something I misunderstood?

Best regards,

Francis

Hey @Francis60,

The Adam optimizer did its part in the agent_learn function when we called its apply_gradients method. If we didn't have that line of code, no weight updates would have happened to the Q-Network.
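
Here is a minimal sketch of that custom training step (the names compute_loss, experiences and gamma are stand-ins for the assignment's own code, which may differ in detail):

import tensorflow as tf

@tf.function
def agent_learn(experiences, gamma):
    # Record the loss computation so its gradients can be taken
    with tf.GradientTape() as tape:
        loss = compute_loss(experiences, gamma, q_network, target_q_network)
    # Gradients of the loss w.r.t. the Q-Network's trainable weights
    gradients = tape.gradient(loss, q_network.trainable_variables)
    # This is the line that actually changes the Q-Network's weights
    optimizer.apply_gradients(zip(gradients, q_network.trainable_variables))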

Cheers,
Raymond

I’m not sure I stated my question clearly enough. :thinking:

In the code I don’t see either of these:

model.compile(
model.fit(

which would explicitly trigger the model adjustment.

I tried with a different optimizer (plain gradient descent):

from tensorflow.keras.optimizers import SGD, Adam

optimizer = SGD(learning_rate=ALPHA)
# optimizer = Adam(learning_rate=ALPHA)

SGD: solved in 913 episodes! Total Runtime: 20.61 min
Adam: solved in 596 episodes! Total Runtime: 14.17 min
(running on a CPU-only machine with 16 GB of memory)

So what I understand is that this line:

optimizer.apply_gradients(zip(gradients, q_network.trainable_variables))

actually invokes the optimizer, so each time it runs the model receives the full Adam refinement rather than just having the raw computed gradients applied.

Is that right?


I asked ChatGPT 3.5, and it answered this:

  • If model.compile() or model.fit() are not executed, the model is not trained, and therefore, the optimization process, including adaptation, does not occur. The optimizer’s parameters remain unchanged until the model is compiled and trained.

And also that:

  • Applying Gradients : After computing the gradients, you call the optimizer’s apply_gradients() method, passing it the gradients along with the corresponding model parameters. The optimizer then applies these gradients to update the model’s parameters according to its optimization algorithm.
  • when optimizer.apply_gradients() is called, the optimizer refines not only the learning rates but also any other parameters it maintains, such as momentums.

So ‘apply_gradients()’ does not just apply the gradients straight away; it uses the optimizer’s full update rule and keeps track of state from previous steps. :ok_hand:
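
Here is a tiny standalone check (a toy example, not assignment code) that shows apply_gradients doing more than plain gradient descent:

import tensorflow as tf

w = tf.Variable(1.0)
grad = tf.constant(0.5)
adam = tf.keras.optimizers.Adam(learning_rate=0.1)

for step in range(3):
    adam.apply_gradients([(grad, w)])
    # Plain gradient descent would move w by learning_rate * grad = 0.05 each time.
    # Adam instead rescales the step using running estimates of the gradient's
    # first and second moments, which it keeps as internal state between calls.
    print(step, float(w.numpy()))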

Where is the ‘resolved’ button?


Hello @Francis60,

Yes, adam.apply_gradients does apply the gradients in the manner of Adam. We use our optimizer this way because we have a customized training process - gathering new (s, a, r, s′) tuples and soft-updating the Target Q-Network - and these customized steps are not part of the standard training procedure initiated by model.compile and model.fit.
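
For reference, the soft update is roughly this (a sketch; TAU and the exact loop only approximate the assignment's own utility function):

TAU = 1e-3  # soft update rate

def soft_update(q_network, target_q_network, tau=TAU):
    # Move each target weight a small step toward the corresponding Q-Network weight
    for target_w, w in zip(target_q_network.weights, q_network.weights):
        target_w.assign(tau * w + (1.0 - tau) * target_w)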

There should be a “Solution” button but somehow this and some of the other buttons just don’t show up in some of the threads… It’s quite mysterious to me too :thinking:

Cheers,
Raymond


I recommend you not use a chat tool for programming advice.

Okay @TMosh, and what about Coursera Coach (Beta)?

Cheers,
Francis

I’m not aware of that tool.

If you are enrolled in a programming course (like MLS), you should be very careful to do your own work. That’s covered in the Code of Conduct.

This is the tool Coursera has been providing for the past few months.

Of course, I do my own work.

Regards,

That’s good to know.

e.g.:

  • Explain this topic in simple terms
  • Give me practice questions
  • Give me a summary
  • Give me real-life examples