Week 1 - Gradient Function explanation


Hello Mentors,

I am working on optional Lab 04 of Week 1 and going through the following notebook:
C1_W1_Lab04_Gradient_Descent_Soln

I am trying to understand gradient_function(x, y, w, b). Can someone share the location of the file where gradient_function() is defined?

“gradient_function” is the name of one of the function arguments.

It’s a reference to whatever function is passed in when the code in your image is called.

Could you elaborate a little more?

I am unable to understand how you called gradient_function without defining it.

See this bit of code from the lab:

When gradient_descent() is called, it passes “compute_gradient” as that function argument.

The definition of the gradient_descent() function uses the 8th argument as the “gradient_function”.
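In plain Python, the pattern looks roughly like this. This is a simplified sketch, not the lab’s exact code; the signature and argument names here just mirror the idea of passing a function object as the gradient_function parameter:

```python
import numpy as np

def compute_gradient(x, y, w, b):
    """Gradient of the squared-error cost for f(x) = w*x + b (one feature)."""
    m = x.shape[0]
    dj_dw = 0.0
    dj_db = 0.0
    for i in range(m):
        f_wb = w * x[i] + b          # model prediction for example i
        dj_dw += (f_wb - y[i]) * x[i]
        dj_db += (f_wb - y[i])
    return dj_dw / m, dj_db / m

def gradient_descent(x, y, w_in, b_in, alpha, num_iters, gradient_function):
    """gradient_function is just a parameter: any callable with
    signature (x, y, w, b) that returns (dj_dw, dj_db) will work."""
    w, b = w_in, b_in
    for _ in range(num_iters):
        dj_dw, dj_db = gradient_function(x, y, w, b)  # call whatever was passed in
        w -= alpha * dj_dw
        b -= alpha * dj_db
    return w, b

# The function object itself (no parentheses!) is passed as an argument:
x = np.array([1.0, 2.0])
y = np.array([300.0, 500.0])
w_final, b_final = gradient_descent(x, y, 0.0, 0.0, 1e-2, 10000, compute_gradient)
```

Note that compute_gradient is passed without parentheses: writing compute_gradient(...) would call it immediately, while the bare name hands the function object itself to gradient_descent.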

1 Like

I missed that part. I should have observed it better. Thank you so much for explaining with screenshots. It was very helpful.

1 Like

Hi, I had the same question as tamalmallick. Based on your response, I am starting to understand that the final call of gradient_descent() does use the functions compute_cost and compute_gradient defined earlier with def.

Why use the parameters cost_function and gradient_function when defining gradient_descent? Couldn’t the lab have passed in compute_cost and compute_gradient directly?

Yes, they could have just called those functions directly, but they are showing you a way to write more flexible, general code in Python. In Python you can pass references to functions as parameters to other functions. So they have written a general gradient_descent function that can be used with different cost functions.

For example, suppose that you were considering two different cost functions for a given problem. With the code they wrote, you would only have to write the gradient descent logic once and then you could try both cost functions and compare the results.

Of course, note that the compute_gradient function is paired with the compute_cost function. If you use a different cost function, then the gradient values must be derived from that cost function instead.
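To illustrate that point with a sketch (this is not lab code; gradient_mse and gradient_mae are made-up names here), the same gradient_descent loop can be reused with the gradients of two different cost functions:

```python
import numpy as np

def gradient_descent(x, y, w, b, alpha, num_iters, gradient_function):
    """Generic loop: works with ANY gradient_function of signature (x, y, w, b)."""
    for _ in range(num_iters):
        dj_dw, dj_db = gradient_function(x, y, w, b)
        w -= alpha * dj_dw
        b -= alpha * dj_db
    return w, b

def gradient_mse(x, y, w, b):
    """Gradient of the mean squared error cost."""
    err = (w * x + b) - y
    return np.mean(err * x), np.mean(err)

def gradient_mae(x, y, w, b):
    """(Sub)gradient of the mean absolute error cost -- a different
    cost function, so the gradient formula is different too."""
    sign = np.sign((w * x + b) - y)
    return np.mean(sign * x), np.mean(sign)

x = np.array([1.0, 2.0, 3.0])
y = np.array([2.0, 4.0, 6.0])

# The gradient-descent logic is written once; only the gradient changes:
w_mse, b_mse = gradient_descent(x, y, 0.0, 0.0, 0.05, 5000, gradient_mse)
w_mae, b_mae = gradient_descent(x, y, 0.0, 0.0, 0.05, 5000, gradient_mae)
```

Each gradient function must match its own cost function; mixing the MSE cost with the MAE gradient (or vice versa) would optimize the wrong objective.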

That context is really helpful, thank you for clarifying!

For me it was also rather confusing at the beginning. I know the lab’s version is more general, but back then I needed something more understandable. Here is some code I simplified for myself for better understanding (I also removed J; it’s nice to see, but I am a beginner here, so the smaller the code that does something, the more understandable it is).

I appreciate your help, but I must delete the code from your reply because sharing your code for a graded assignment is not allowed.