Hello, while doing the gradient descent lab, I found something I can't understand about how the function gradient_descent(x, y, w_in, b_in, alpha, num_iters, cost_function, gradient_function) is implemented.
If I understood correctly, the w_final and b_final that are returned are the w and b computed at iteration number 10000, i.e. the last iteration. I don't think that is correct, because there is no guarantee that the last iteration produces the minimum cost among all the iterations (or am I reading the code wrongly?). I think we should instead save w_final and b_final whenever the computed cost reaches a new minimum, and return those after iteration 10000.
Could someone please explain this to me? Please find the function below:
import math

def gradient_descent(x, y, w_in, b_in, alpha, num_iters, cost_function, gradient_function):
    # Arrays to store cost J and the parameters at each iteration, primarily for graphing later
    J_history = []
    p_history = []
    b = b_in
    w = w_in

    for i in range(num_iters):
        # Calculate the gradient and update the parameters using gradient_function
        dj_dw, dj_db = gradient_function(x, y, w, b)

        # Update parameters using equation (3) above
        b = b - alpha * dj_db
        w = w - alpha * dj_dw

        # Save cost J at each iteration
        if i < 100000:  # prevent resource exhaustion
            J_history.append(cost_function(x, y, w, b))
            p_history.append([w, b])

        # Print cost 10 times over the run, or every iteration if num_iters < 10
        if i % math.ceil(num_iters / 10) == 0:
            print(f"Iteration {i:4}: Cost {J_history[-1]:0.2e} ",
                  f"dj_dw: {dj_dw: 0.3e}, dj_db: {dj_db: 0.3e} ",
                  f"w: {w: 0.3e}, b: {b: 0.5e}")

    return w, b, J_history, p_history  # return w, b and J, w history for graphing
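To make my suggestion concrete, here is a minimal sketch of the change I have in mind (my own code, not from the lab; best_w, best_b, and best_cost are names I made up). It keeps track of the lowest cost seen so far and returns the parameters that produced it, instead of whatever the last iteration happened to compute:

def gradient_descent_min(x, y, w_in, b_in, alpha, num_iters, cost_function, gradient_function):
    # Same loop as in the lab, but additionally remember the parameters
    # that produced the lowest cost seen so far (hypothetical variant).
    J_history = []
    p_history = []
    b = b_in
    w = w_in
    best_w, best_b = w, b
    best_cost = cost_function(x, y, w, b)

    for i in range(num_iters):
        dj_dw, dj_db = gradient_function(x, y, w, b)
        b = b - alpha * dj_db
        w = w - alpha * dj_dw

        cost = cost_function(x, y, w, b)
        J_history.append(cost)
        p_history.append([w, b])

        # Keep the best parameters seen so far instead of only the latest ones
        if cost < best_cost:
            best_cost = cost
            best_w, best_b = w, b

    return best_w, best_b, J_history, p_history

Since the lab's loop already computes the cost at every iteration anyway, the only extra work here is the comparison and storing the two best parameters.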