This is my first post, but I hope this observation will be interesting to you all.
While watching the videos on the Gradient Descent algorithm, and specifically its end state, an observation came to mind:
If we accidentally choose our initial w(i) and b values such that we land on a local maximum of the cost function, the algorithm would fail - every partial derivative would be zero, so it would get stuck in place, right?
Naturally, I'm not talking about linear regression - this would need a more complex model (for linear regression the cost function is bowl-shaped and doesn't have any maximum).
I guess there's a trick to avoiding this problem?
I was thinking of taking a small random step in some direction, just to get the algorithm rolling?
If w and b sit exactly at a local maximum, gradient descent gets stuck at that point because the gradient is zero.
That's why we use techniques like random initialization of w, which makes landing exactly on a local maximum highly unlikely.
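To make this concrete, here is a tiny illustration (my own toy example, not from the course): the 1-D cost J(w) = w^4 - 2w^2 has a local maximum at w = 0 and minima at w = ±1. Started exactly at the maximum, gradient descent never moves; started with even a tiny random offset, it rolls down to a minimum.

```python
# Toy cost J(w) = w**4 - 2*w**2: local maximum at w = 0, minima at w = +/-1.
def grad(w):
    # dJ/dw = 4*w**3 - 4*w; exactly zero at the local maximum w = 0
    return 4 * w**3 - 4 * w

def gradient_descent(w, lr=0.1, steps=100):
    for _ in range(steps):
        w -= lr * grad(w)
    return w

print(gradient_descent(0.0))   # stuck forever: the update is w - lr*0
print(gradient_descent(0.01))  # a tiny offset escapes and converges near w = 1
```

This is exactly the "random step to get rolling" idea: any perturbation away from w = 0 gives a nonzero gradient, and the usual updates take over from there.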
And another point is that in most machine learning algorithms, we have lots of features.
Suppose a model has 50 features, so its w is 50-dimensional; for a point to be a local maximum, the cost has to curve downward along every one of those 50 directions at once, which is highly unlikely.
In practice, a machine learning model is unlikely to encounter a true local maximum or minimum; mostly it gets stuck at saddle points or plateaus, where the derivatives stay close to 0 for a long time and training becomes slow.
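A minimal sketch of that saddle-point behavior, using the classic hypothetical surface J(x, y) = x^2 - y^2 (curving up along x, down along y, with zero gradient at the origin): near the saddle the gradient is tiny, so the updates barely move for many steps before y slowly escapes along the downhill direction.

```python
import numpy as np

# Hypothetical 2-D cost with a saddle at the origin: J(x, y) = x**2 - y**2
def grad(p):
    x, y = p
    return np.array([2 * x, -2 * y])  # zero at (0, 0), but (0, 0) is not a minimum

p = np.array([1e-6, 1e-6])  # start very close to the saddle point
for _ in range(50):
    p = p - 0.1 * grad(p)   # gradient magnitude ~1e-6 here, so progress is slow

# x shrinks toward 0 while y only slowly grows along the escape direction
print(p)
```

After 50 steps the y coordinate has moved less than 0.01 - that crawl near zero gradient is why saddle points and plateaus slow training rather than stop it outright.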
Also, we prefer to use cost functions that are convex, so there are no local maxima at all: every gradient points "downhill" toward the single global minimum.
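For example, the squared-error cost for linear regression is convex in (w, b), so gradient descent reaches the same global minimum from any starting point. A quick sketch (toy data I made up, with true parameters w = 2, b = 1):

```python
import numpy as np

# Toy data generated from y = 2*x + 1; the squared-error cost is convex in (w, b)
x = np.array([0.0, 1.0, 2.0, 3.0])
y = 2.0 * x + 1.0

def step(w, b, lr=0.1):
    pred = w * x + b
    dw = np.mean(2 * (pred - y) * x)  # d(cost)/dw
    db = np.mean(2 * (pred - y))      # d(cost)/db
    return w - lr * dw, b - lr * db

# Two wildly different initializations both converge to the same minimum
for w0, b0 in [(-10.0, 10.0), (5.0, -5.0)]:
    w, b = w0, b0
    for _ in range(2000):
        w, b = step(w, b)
    print(round(w, 3), round(b, 3))  # both runs end near w = 2, b = 1
```

With a bowl-shaped cost there is nothing to get stuck on, which is exactly why the initialization question only really bites for more complex, non-convex models.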