I am missing something about how the code works.
The optimize() function returns the computed gradients. In my opinion, it should return the updated parameters.
In fact, when we call the optimize() function inside the model() function, we don't use its output gradients.
How do the parameters inside the model() function get updated, starting from the initial ones?
Of course the code works, so I'm sure I'm missing something, but I can't understand what it is.
Hello @Riccardo_Andreoni ,
Welcome back to the Discourse community! It has been a while since you have posted here. Thank you so much for coming back here to ask your questions. I am a Mentor and I will do my best to answer your question.
The optimize() function computes the gradients of the loss function with respect to the parameters. The gradients indicate how the parameters should be updated to minimize the loss.
The model() function actually updates the parameters using the gradients. It makes a small step in the opposite direction of the gradient to reduce the loss.
This process is repeated over many iterations (epochs) until the loss is minimized and the model is trained. So the parameters get updated from their initial random values to optimal values that minimize the loss function.
The optimize() and model() functions work together in this iterative process to train the model: optimize() computes gradients, model() updates parameters using gradients.
So in summary, the parameters get updated and learned inside the model using the gradients computed by optimize(). This gradient descent process slowly improves the model by minimizing the loss function.
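To make this more concrete, here is a tiny self-contained illustration of gradient descent (just a sketch with made-up numbers, not the assignment's code):

# Minimize f(w) = (w - 3)^2 by repeatedly stepping opposite the gradient.
w = 0.0                               # initial parameter value
learning_rate = 0.1
for _ in range(50):
    gradient = 2 * (w - 3)            # the "optimize" role: compute the gradient f'(w)
    w = w - learning_rate * gradient  # the "model" role: step opposite the gradient
print(w)                              # close to 3, the value that minimizes f

Each pass through the loop moves w a little closer to the minimizer, just as the parameters in the assignment move a little closer to the values that minimize the loss.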
I hope my step-by-step explanation of the optimization process clarifies your question. If you feel unsure about any of the steps that I wrote above, or if you have a follow-up question, please feel free to reply to my response.
Regards,
Can Koz
Thank you for your prompt reply. I understand the principle you describe, but I don't understand how it is actually applied in the Python code.
Inside the model() function, the parameters variable is fed as input to the optimize() function, which calls the update_parameters() function. My problem is that optimize() doesn't output the updated parameters variable. Instead it outputs the gradients variable, which is no longer used after calling optimize().
It is a fact that the code still works, so the parameters inside the model() function are somehow updated. My hypothesis is that they are updated because the variable shares the same name both inside the model() function and inside the update_parameters() function.
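For example, this small experiment (my own illustration, not the assignment code) shows how a function can change a dictionary for its caller without returning it:

# A dict is passed by reference, so in-place changes made inside a
# function are visible to the caller even if the return value is ignored.
def update_parameters(parameters, gradients, learning_rate):
    parameters["W"] += -learning_rate * gradients["dW"]  # mutates the caller's dict
    return parameters                                    # returning it is optional here

params = {"W": 1.0}
update_parameters(params, {"dW": 2.0}, learning_rate=0.1)  # output discarded
print(params)  # {'W': 0.8} -- updated even though we ignored the return value

If the dictionary is modified in place like this, the caller sees the update even without capturing the return value.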
I thought something like this would be correct:
curr_loss, parameters, a_prev = optimize(X, Y, a_prev, parameters, learning_rate = 0.01)
Otherwise there is no need to call update_parameters() inside optimize(): it only produces the updated parameters variable, which is never returned by optimize():
def optimize(...)
...
# Update parameters (≈1 line)
parameters = update_parameters(parameters, gradients, learning_rate)
return loss, gradients, a[len(X)-1]
I know this reply is quite convoluted, but I hope I was able to explain myself.
Thank you again for your explanation!
See this image from the instructions: [screenshot of the assignment instructions, not reproduced here]
In this assignment, "optimize" only performs one training step, not the entire training process.
It’s maybe not a good name for the function in this context.