I’m learning machine learning right now. Since the weights and biases are randomly initialized at the beginning of training a neural network, is it possible that with gradient descent optimization the cost function gets stuck at a local minimum instead of the global minimum? I don’t think this was mentioned in the class.
Thanks, I understand that the cost function is not convex, so in an unlucky case it could get stuck in a local minimum. But how do we avoid this? Train multiple times? With gradient descent, the outcome depends on the initial weight and bias values (the starting point), so I feel like trying multiple runs with different random initializations could avoid the problem. What’s the most efficient and effective way to avoid this issue?
Exactly. Finding the overall global minimum is actually not what you want in any case, because it would very likely represent extreme overfitting on the training data. It turns out that for sufficiently complex networks, there is a band of local minima that are good solutions and are numerous enough that you have a reasonable chance of finding one of them. This statement is based on work from Yann LeCun’s research group, and the paper is linked from this other thread which discusses this same question.
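Just to add some intuition on the “train multiple times with random initializations” idea: here is a small toy sketch (my own NumPy example, not from the course or the paper) that runs plain gradient descent on a simple non-convex 1D function from several random starting points. Different starts can settle into different local minima, and keeping the best of a few restarts is one simple, if somewhat expensive, mitigation.

```python
import numpy as np

# Toy non-convex "cost" with several local minima (illustrative example only):
# f(w) = sin(3w) + 0.5 * w^2
def cost(w):
    return np.sin(3 * w) + 0.5 * w ** 2

def grad(w):
    return 3 * np.cos(3 * w) + w

def gradient_descent(w0, lr=0.01, steps=500):
    w = w0
    for _ in range(steps):
        w -= lr * grad(w)   # plain gradient descent update
    return w

rng = np.random.default_rng(0)
results = []
for _ in range(5):                    # "train" 5 times with random restarts
    w0 = rng.uniform(-3, 3)           # random initialization (the starting point)
    w_final = gradient_descent(w0)
    results.append((w0, w_final, cost(w_final)))

for w0, w_final, c in results:
    print(f"start {w0:+.2f} -> ends at w = {w_final:+.2f}, cost = {c:.3f}")

# Different starting points can end in different local minima, but several of
# them reach similarly low cost values; keep the best one if you do restarts.
best = min(results, key=lambda r: r[2])
print("best of the restarts:", best)
```

In practice, though, per the point above, you rarely need to hunt for the global minimum: any of the reasonably low minima tends to be good enough.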