Why is the number of iterations in gradient descent specified?

Steffen_Krogmann · March 19, 2023, 7:14pm

In the optional labs, gradient descent is implemented with a specific number of iterations and a for loop.

Why can’t we use a while loop and just stop when the cost is no longer improving? Wouldn’t that lead to a better result?

Thanks!

paulinpaloalto · March 19, 2023, 7:32pm

Yes, it is a good observation: there are more sophisticated ways to manage gradient descent than a fixed iteration count, but note that it’s not as simple as quitting the first time the cost “ticks up”. It turns out that convergence is not always monotonic. But this is just the very intro to all this, and eventually we turn it over to “canned” packages like TensorFlow that have sophisticated internal implementations of all this.
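To make the stopping idea concrete, here is a minimal sketch of a convergence-based stop for single-feature linear regression with the squared error cost. It is not the code from the optional labs: the function name and the `tol`, `patience`, and `max_iters` parameters are assumptions chosen for illustration. Rather than quitting on the first iteration that fails to improve, it waits for several consecutive stalled iterations and still keeps an iteration cap as a safety net.

```python
import numpy as np

def gradient_descent_with_stopping(x, y, alpha=0.05, tol=1e-12,
                                   patience=5, max_iters=100_000):
    """Fit y ~ w*x + b with batch gradient descent (illustrative sketch).

    Stops once the cost has failed to improve on the best value seen so far
    by more than `tol` for `patience` consecutive iterations, or after
    `max_iters` iterations, whichever comes first.
    """
    w, b = 0.0, 0.0
    m = len(x)
    best_cost = np.inf
    stalled = 0
    n_iters = 0
    for i in range(max_iters):
        n_iters = i + 1
        err = w * x + b - y                  # prediction error per example
        cost = (err ** 2).sum() / (2 * m)    # squared error cost
        if best_cost - cost > tol:
            best_cost = cost                 # meaningful improvement: reset counter
            stalled = 0
        else:
            stalled += 1                     # no real improvement this iteration
            if stalled >= patience:
                break
        # Simultaneous update of both parameters
        w -= alpha * (err @ x) / m
        b -= alpha * err.sum() / m
    return w, b, cost, n_iters

# Hypothetical data generated from y = 2x + 1
x = np.array([1.0, 2.0, 3.0, 4.0])
y = 2.0 * x + 1.0
w, b, cost, n_iters = gradient_descent_with_stopping(x, y)
```

The `patience` counter is what guards against stopping on a single bad step, and the cap on iterations ensures the loop ends even if the cost never settles.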

Steffen_Krogmann · March 20, 2023, 8:54am

Thank you for the quick feedback. Looking forward to learning more!

When you say that convergence is not always monotonic, is that in relation to models …
a) that are not just simple linear regression with one feature/variable, and
b) that do not have a squared error cost function?

I can’t imagine how the cost could “tick up” here unless we overshoot the optimum.

Thanks for your support! Much appreciated.

paulinpaloalto · March 20, 2023, 2:40pm

Sorry, I should make the disclaimer that I have not taken MLS and don’t know what is covered in MLS C1 Week 1. If it is linear regression with the squared error cost function or logistic regression with the cross-entropy loss function, then in those cases the solution surfaces are convex, but it is still possible (as you say) to overshoot the minimum if you are using a fixed learning rate. I was speaking of the general case of neural networks, in which the solution surfaces are no longer convex and the paths you can take are much more complex and can exhibit much more varied behavior.
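A small sketch of the overshoot case on a convex cost, assuming the toy function J(w) = w² (gradient 2w) rather than anything from the course. With a fixed learning rate, each update multiplies w by (1 − 2·alpha), so a large enough alpha makes every step jump past the minimum at w = 0 and land farther away than it started. The name `run_gd` and the specific learning rates are hypothetical.

```python
def run_gd(alpha, w0=1.0, steps=5):
    """Gradient descent on J(w) = w**2; returns the cost before and after each step."""
    w = w0
    costs = [w ** 2]
    for _ in range(steps):
        w -= alpha * 2 * w      # gradient of w**2 is 2*w
        costs.append(w ** 2)
    return costs

small = run_gd(alpha=0.1)   # w shrinks by a factor of 0.8 per step: cost falls
large = run_gd(alpha=1.1)   # w flips sign and grows by about 1.2 per step: cost rises
```

So even on a convex surface the cost can go up from one iteration to the next if the fixed learning rate is too large.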

Steffen_Krogmann · March 20, 2023, 7:05pm

Ah, thank you for the clarification. Now I understand.
