Understanding of local optima in deep networks

Hi!

Does anybody know of a good resource that summarizes the state-of-the-art understanding of this problem? A good article (academic or non-academic), or perhaps a survey paper?

Thank you!
Mohammad

Neural networks have non-convex cost functions, so their loss surfaces can contain local minima.
What sort of paper are you looking for beyond that?
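To make the non-convexity concrete, here is a minimal sketch (plain NumPy, with made-up toy data, so all names here are illustrative) using the hidden-unit permutation symmetry: two distinct weight vectors achieve the same zero loss, but their midpoint does not, which a convex function would not allow.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))     # toy inputs (hypothetical data)

def loss(W1, w2, y):
    """MSE of a tiny 1-hidden-layer tanh network with 2 hidden units."""
    pred = np.tanh(X @ W1) @ w2
    return np.mean((pred - y) ** 2)

W1 = rng.normal(size=(3, 2))     # input -> hidden weights
w2 = rng.normal(size=2)          # hidden -> output weights
y = np.tanh(X @ W1) @ w2         # targets the network fits exactly

# Swapping the two hidden units gives a *different* point in weight
# space with exactly the same (zero) loss.
W1p, w2p = W1[:, ::-1], w2[::-1]

print(loss(W1, w2, y), loss(W1p, w2p, y))       # both 0.0
print(loss((W1 + W1p) / 2, (w2 + w2p) / 2, y))  # generically > 0

# Convexity would force the midpoint of two global minima to be a
# minimum too (loss <= 0 here), so a positive value at the midpoint
# shows the loss surface cannot be convex.
```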


Here’s an earlier thread on this general topic that includes a link to a paper from Yann LeCun’s group on cost surfaces. Please let us know if that looks like it’s relevant for your question.

Thank you both. I am referring to the subject of Andrew's last lecture in week 2 of course 2. There was (and to some extent still is) a huge debate about why neural nets get stuck in local minima, and whether that makes finding good solutions hopeless. Later it was argued that most critical points on the cost surface of a high-dimensional network are actually saddle points, not local minima, so the problem is not as severe as people had thought. I was asking whether there is a good recent survey that summarizes current opinion on this matter.
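For intuition about the saddle-point distinction, here is a small sketch (plain NumPy, using a toy 2-D function rather than a real network loss) of the standard test: at a critical point, check the signs of the Hessian's eigenvalues. All positive means a local minimum; mixed signs mean a saddle point, which gradient descent can still escape along the negative-curvature direction.

```python
import numpy as np

def f(p):
    x, y = p
    return x**2 - y**2           # toy surface with a critical point at the origin

def numeric_hessian(f, p, eps=1e-4):
    """Central-difference Hessian; accurate enough for this 2-D toy example."""
    n = len(p)
    H = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            e_i, e_j = np.eye(n)[i] * eps, np.eye(n)[j] * eps
            H[i, j] = (f(p + e_i + e_j) - f(p + e_i - e_j)
                       - f(p - e_i + e_j) + f(p - e_i - e_j)) / (4 * eps**2)
    return H

eigs = np.linalg.eigvalsh(numeric_hessian(f, np.zeros(2)))
print(eigs)  # ~[-2., 2.]: mixed signs, so the critical point is a
             # saddle, not a local minimum
```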

The thread and the paper from Yann LeCun's group were what I was looking for. I was also trying to see whether there are newer explanations along these lines.