Gradient Descent two local minima

sujirsachin · May 11, 2024, 4:58pm

In the video above, I didn’t understand if two local minima exists, how would the algorithm get to the global minimum?

TMosh · May 11, 2024, 5:25pm

There is no guarantee that gradient descent will find the global minimum.

That’s why it’s important to use convex cost functions whenever possible. They only have one minimum.

pastorsoto · May 11, 2024, 7:06pm

Hi @sujirsachin there are two parts of this question, the first part is local minimum which is the smallest value for an input of a function, so you take an specific range of values and the minimum value of that range is the local minimum, so you can have several local minimum. The second part is the global minimum which is the smallest value in the domain of the function (across all possible range of values of the function the lowest value)

Let’s evaluate an example

In the above picture you have some range of values in the x axis representing inputs and the y-axis the function outputs

Example:

Range between 1 to 5 you get the minimum value -1 (local minimum)
Range between 5 to 10 you get the minimum value 2 (another local minimum)
Range between 10 to 15 you get the minimum value -1.3 (another local minimum)

In this case the global minimum is -1.3 assuming the domain of the function to be x=1 to 15. This is the lowest value across all the intervals

When you want to optimize you can get “stuck” at a local minimum and never reach the global minimum, since you need to evaluate a wide range of values for the entire function to make it work.

Let me know if this helps!

sujirsachin · May 12, 2024, 7:15am

@pastorsoto Thank you so much for taking the time to write a detailed explanation.

One follow up question- So if we are “stuck” at the local minimum and never reach the global minimum, there’s no way we can reach the global minimum correct?

pastorsoto · May 12, 2024, 5:29pm

Yes, selecting the right parameters for optimizing your model could help to reach the global minimum, the thing is that you won’t know if you are at local or global minimum, that’s what @TMosh is suggesting when he pointed out that you need to use convex function

TMosh · May 12, 2024, 5:33pm

If you have to use a non-convex cost function (such as for training a neural network), then you can train multiple times using different initial weight values. Then compare all of the solutions and select the one with the lowest cost.

Sometimes you don’t need to find the global minimum - often any minimum that is “good enough” will do.

Topic		Replies	Views
C1_W1_Gradient-Descent Supervised ML: Regression and Classification week-1	3	570	July 28, 2022
Local minimum vs Global minimum in the context of Gradient Descent Supervised ML: Regression and Classification week-1	5	757	December 29, 2022
Gradient descent C1_W1 Supervised ML: Regression and Classification week-1	2	514	July 31, 2022
Query regarding local and global minima Supervised ML: Regression and Classification week-1	2	530	July 1, 2022
Convergence of Gradient Descent Supervised ML: Regression and Classification week-2	1	509	July 10, 2022

Gradient Descent two local minima

Related topics