Week 2 lab notebook


In the notebook, the function compute_gradient() is defined first, and then the later gradient_descent() function uses gradient_function. Shouldn't gradient_function actually be called compute_gradient(), since we are using the function defined earlier?

Hi @flyunicorn,

Here, compute_gradient is passed as an argument, and inside gradient_descent() it is called as gradient_function(X, y, w, b). This design allows flexibility: you could swap in a different gradient computation function if needed. That makes gradient_descent() more general-purpose, although technically it would also work to use the same name for both. Does that make sense?
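
For example, here is a minimal sketch of the idea (toy values and a simplified signature, not the notebook's actual code):

def compute_gradient(X, y, w, b):
    # toy stand-in: pretend these are the real dj_db, dj_dw
    return 0.1, 0.2

def gradient_descent(X, y, w, b, gradient_function):
    # whatever function object was passed in is now locally known
    # as gradient_function
    dj_db, dj_dw = gradient_function(X, y, w, b)
    return dj_db, dj_dw

# pass the function object itself (no parentheses after compute_gradient)
print(gradient_descent(None, None, 0.0, 0.0, compute_gradient))  # (0.1, 0.2)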

To add to @conscell's answer (nice name!):

The idea is to be able to swap out the gradient function.

Suppose you have code to compute the gradient one way:

def compute_gradient_one_way(X, y, w, b):
   # ... compute dj_db and dj_dw one way ...
   return dj_db, dj_dw

and code to compute the gradient another way:

def compute_gradient_another_way(X, y, w, b):
   # ... compute dj_db and dj_dw another way ...
   return dj_db, dj_dw

Now take the function gradient_descent, which I rename here to something more descriptive (finding good names is hard, because a name may not help you look at the thing in the right way):

def do_gradient_descent_based_on_some_gradient_function(
   x, y, w_in, b_in, some_cost_function, some_gradient_function, alpha, num_iters):
   ...text...
   ...text...
   ...text...
   # here we call whatever function was passed in as "some_gradient_function"
   # to compute the gradient at the current w and b:
   dj_db, dj_dw = some_gradient_function(x, y, w, b)
   ...text...
   ...text...

Then you can do gradient descent by passing the different gradient-computing functions to the do_gradient_descent_based_on_some_gradient_function() function.

Like this:

do_gradient_descent_based_on_some_gradient_function(
   x, y, w_in, b_in, 
   some_cost_function, 
   compute_gradient_one_way, 
   alpha, num_iters)

and later maybe

do_gradient_descent_based_on_some_gradient_function(
   x, y, w_in, b_in, 
   some_cost_function, 
   compute_gradient_another_way, 
   alpha, num_iters)
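
If it helps, here is a self-contained toy version of the sketch above, assuming plain linear regression with NumPy. I leave out the cost function and the usual cost-history bookkeeping to keep it short, and all names are mine, not the notebook's:

import numpy as np

def compute_gradient_one_way(X, y, w, b):
    # explicit loop over the training examples
    m = X.shape[0]
    dj_dw = np.zeros_like(w)
    dj_db = 0.0
    for i in range(m):
        err = np.dot(X[i], w) + b - y[i]
        dj_dw += err * X[i]
        dj_db += err
    return dj_db / m, dj_dw / m

def compute_gradient_another_way(X, y, w, b):
    # the same gradient, fully vectorized
    m = X.shape[0]
    err = X @ w + b - y
    return np.sum(err) / m, (X.T @ err) / m

def do_gradient_descent_based_on_some_gradient_function(
        X, y, w_in, b_in, some_gradient_function, alpha, num_iters):
    w, b = w_in.copy(), b_in
    for _ in range(num_iters):
        # whatever function was passed in is called here
        dj_db, dj_dw = some_gradient_function(X, y, w, b)
        w = w - alpha * dj_dw
        b = b - alpha * dj_db
    return w, b

X = np.array([[1.0], [2.0], [3.0]])
y = np.array([2.0, 4.0, 6.0])

# same descent routine, two interchangeable gradient implementations;
# both should end up near w = [2.], b = 0.
print(do_gradient_descent_based_on_some_gradient_function(
    X, y, np.zeros(1), 0.0, compute_gradient_one_way, alpha=0.1, num_iters=1000))
print(do_gradient_descent_based_on_some_gradient_function(
    X, y, np.zeros(1), 0.0, compute_gradient_another_way, alpha=0.1, num_iters=1000))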

Ah, I see. We don't see gradient_function() defined anywhere in the notebook; it is just used directly, like this: dj_db, dj_dw = gradient_function(X, y, w, b)

So that means it can be the same as compute_gradient(), just under a different name, or it can be a different way of computing the gradient. Correct?

Yes.

In this case, the function has been written out and given the name compute_gradient:

def compute_gradient(X, y, w, b):
   instruction...
   instruction...
   instruction...
   return dj_db, dj_dw

One would just "pass the function" to gradient_descent by giving its name, and gradient_descent will call it using its local name as defined in its header, which is gradient_function.

We code this:

instruction...
# pass compute_gradient itself (no parentheses); inside gradient_descent
# it is available under the parameter name gradient_function
gradient_descent(my_X, my_y, my_w_in, my_b_in, my_cost_function, compute_gradient, my_alpha, my_num_iters)
instruction...

P.S.

In programming languages that are not as happy-go-lucky as Python, what all of this means would be made explicit through appropriate type annotations. I think it is a great mistake not to teach Python type annotations to neophytes immediately, because, as the last 70 years have shown, typing is the basis of coding. But so it goes.

from typing import Callable, List

# note: this shadows Python's built-in map; it is only meant to illustrate
# how a function-valued parameter can be annotated with Callable
def map(f: Callable[[int], int], arr: List[int]) -> List[int]:
    return [f(x) for x in arr]

# Example usage:
def square(x: int) -> int:
    return x * x

numbers = [1, 2, 3, 4]
result = map(square, numbers)  # [1, 4, 9, 16]
print(result)
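
Applied to this thread, the function-valued parameter of gradient_descent could be annotated in the same way. This is only a sketch of how the header might look; the exact array and return types are my assumption, not the notebook's code:

from typing import Callable, Tuple
import numpy as np

# type of a gradient function: takes (X, y, w, b), returns (dj_db, dj_dw)
GradientFunction = Callable[
    [np.ndarray, np.ndarray, np.ndarray, float],
    Tuple[float, np.ndarray],
]

def gradient_descent(X, y, w_in, b_in, cost_function,
                     gradient_function: GradientFunction,
                     alpha: float, num_iters: int):
    ...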

Hi @flyunicorn,

I would just like to remove my last comment, as I had misinterpreted it.