Feedback Wanted on a Possible Optimization for the Gradient Descent Algorithm

Hey Everyone,

I hope you all are doing well.

Yesterday I completed the Week 1 classes of the Supervised Machine Learning module of the ML Specialization course. After learning about the gradient descent algorithm, an optimization technique came to mind that might address the limitation described below: the algorithm's difficulty in finding the global minimum of a cost function that is non-convex (not bowl-shaped).

Please correct me if I am wrong, but what I understood from the lecture is that the limitation of gradient descent on non-convex functions is this: once you reach a local minimum, you are stuck there, even though a better (lower) minimum may exist elsewhere in the function. In other words, you might not find the global minimum.
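
For example, here is a tiny Python sketch of the issue as I understand it (my own toy 1-D example, not code from the course):

```python
# Toy 1-D example: plain gradient descent settles in whichever
# valley is downhill from its starting point.
def f(w):
    return 0.1 * w**4 - w**2 + 0.5 * w   # non-convex cost with two valleys

def grad(w):
    return 0.4 * w**3 - 2 * w + 0.5      # derivative of f

def gradient_descent(w0, alpha=0.01, steps=5000):
    w = w0
    for _ in range(steps):
        w -= alpha * grad(w)              # standard update: w := w - alpha * dJ/dw
    return w

print(gradient_descent(-3.0))  # ~ -2.35, the deeper (global) minimum
print(gradient_descent(+3.0))  # ~ +2.10, a shallower local minimum -- stuck here
```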

So I thought: what if we draw a horizontal line from the local minimum we initially found, and check the points where that line intersects the function's plot? Each intersection point would then be used as the starting point for another run of gradient descent. Once we have done this for all of them, we can compare the resulting minima and keep the lowest one, giving us either the global minimum or at least a better local minimum.

The plot might look like this. (Sorry for the bad drawing)

A real-world analogy, using the same hiking example given in the lecture: imagine you're hiking and you reach a local valley (a local minimum).

You scan the horizon (horizontal line) and see another valley that’s at the same or lower altitude.
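
Here is a rough sketch in Python of how I imagine it working for a 1-D cost function (my own toy example; the cost function and the grid scan for intersections are just assumptions to illustrate the idea):

```python
import numpy as np

def f(w):
    return 0.1 * w**4 - w**2 + 0.5 * w    # toy non-convex cost (two valleys)

def grad(w):
    return 0.4 * w**3 - 2 * w + 0.5

def gradient_descent(w0, alpha=0.01, steps=5000):
    w = w0
    for _ in range(steps):
        w -= alpha * grad(w)
    return w

def level_set_restart(w0, w_min=-5.0, w_max=5.0, n_grid=2000):
    w_local = gradient_descent(w0)         # first descent: may be only a local minimum
    level = f(w_local)                     # height of the "horizontal line"
    grid = np.linspace(w_min, w_max, n_grid)
    diff = f(grid) - level
    # sign changes mark where the cost curve crosses the horizontal line
    crossings = grid[np.where(np.diff(np.sign(diff)) != 0)[0]]
    # restart gradient descent from each crossing and keep the lowest result
    candidates = [w_local] + [gradient_descent(wc) for wc in crossings]
    return min(candidates, key=f)

print(level_set_restart(3.0))   # first lands in the shallow valley near w ~ 2.1,
                                # then the restarts find the deeper one near w ~ -2.35
```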

I just wanted to share this with you all to get your feedback.

That method might work.

But the limitation of local minima is usually addressed by using only convex cost functions.

Local minima are not a significant issue in practice, since convex cost functions exist for most situations.

And when a non-convex cost function is used, it often turns out that any of the minima will give "good enough" performance.

We don’t necessarily need the absolute minimum cost. Any minimum that gives acceptable predictions is sufficient.

Another method for dealing with local minima is to train the model several times, using different random initial weight values. Then use the solution that settles to the lowest minimum.
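
For example, a rough sketch of that multi-start idea on a toy 1-D cost (just an illustration, not code from the course):

```python
import numpy as np

def f(w):
    return 0.1 * w**4 - w**2 + 0.5 * w    # toy non-convex cost

def grad(w):
    return 0.4 * w**3 - 2 * w + 0.5

def gradient_descent(w0, alpha=0.01, steps=5000):
    w = w0
    for _ in range(steps):
        w -= alpha * grad(w)
    return w

def multi_start(n_starts=10, seed=0):
    rng = np.random.default_rng(seed)
    starts = rng.uniform(-5.0, 5.0, size=n_starts)    # different random initial weights
    results = [gradient_descent(w0) for w0 in starts]
    return min(results, key=f)                        # keep the run with the lowest cost

print(multi_start())   # with enough restarts, one run usually lands in the deepest valley
```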

Hey @TMosh, thanks for your feedback. Do you think this might give better results from a performance perspective, since we would be reaching the optimal weights in a procedural way rather than relying on randomness?

Function minimization has been pretty much an established art for the last 100 years.

This concept might work, but if it were an area for big gains, I believe it would already have been adopted decades ago.