What happens if the cost function has many local optima, or is not convex? What will gradient descent do: will it get stuck at a local optimum, or will it reach the global optimum?
The result depends on the choice of the initial point and the learning rate. For a sufficiently small learning rate, GD will converge to a local optimum determined by the initial point.
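To make that concrete, here is a minimal sketch in plain NumPy (the non-convex function, learning rate, and starting points are illustrative choices, not anything specific to a neural network) showing that plain gradient descent ends up in different local minima depending on where it starts:

```python
import numpy as np

# A non-convex 1-D cost with two local minima (one of them global).
def f(x):
    return x**4 - 3 * x**2 + x

def grad_f(x):
    return 4 * x**3 - 6 * x + 1

def gradient_descent(x0, lr=0.01, steps=1000):
    # Plain gradient descent: repeatedly step downhill from x0.
    x = x0
    for _ in range(steps):
        x = x - lr * grad_f(x)
    return x

# Same algorithm, same learning rate, different starting points:
# one run lands in the global minimum (near x = -1.3),
# the other gets stuck in the shallower local minimum (near x = 1.1).
for x0 in (-2.0, 2.0):
    x_final = gradient_descent(x0)
    print(f"start {x0:+.1f} -> x = {x_final:+.4f}, f(x) = {f(x_final):+.4f}")
```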
It’s an interesting and important question, and the answers are not straightforward:
The cost functions for neural networks are not convex. There is no guarantee that you’ll ever find the global minimum, but that may not even be desirable: most likely it would represent extreme overfitting to the training data. If you choose your gradient descent algorithm and parameters correctly, you can usually find one of the very many local minima that give a reasonable solution.
Here’s a thread which discusses this in more detail.