Understanding of local optima in deep networks

Hi!

Does anybody know of a good resource that summarizes the state-of-the-art understanding of this problem? A good article (academic or non-academic), or perhaps a survey paper?

Thank you!
Mohammad

Neural networks have non-convex cost functions, so their loss surfaces can contain local minima.
What sort of paper are you looking for beyond that?
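To make the non-convexity concrete, here is a minimal sketch (plain NumPy, with made-up toy data, so all names here are illustrative) using the hidden-unit permutation symmetry: two distinct weight vectors achieve the same zero loss, but their midpoint does not, which a convex function would not allow.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))     # toy inputs (hypothetical data)

def loss(W1, w2, y):
    """MSE of a tiny 1-hidden-layer tanh network with 2 hidden units."""
    pred = np.tanh(X @ W1) @ w2
    return np.mean((pred - y) ** 2)

W1 = rng.normal(size=(3, 2))     # input -> hidden weights
w2 = rng.normal(size=2)          # hidden -> output weights
y = np.tanh(X @ W1) @ w2         # targets the network fits exactly

# Swapping the two hidden units gives a *different* point in weight
# space with exactly the same (zero) loss.
W1p, w2p = W1[:, ::-1], w2[::-1]

print(loss(W1, w2, y), loss(W1p, w2p, y))       # both 0.0
print(loss((W1 + W1p) / 2, (w2 + w2p) / 2, y))  # generically > 0

# Convexity would force the midpoint of two global minima to be a
# minimum too (loss <= 0 here), so a positive value at the midpoint
# shows the loss surface cannot be convex.
```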


Here’s an earlier thread on this general topic that includes a link to a paper from Yann LeCun’s group on cost surfaces. Please let us know if that looks like it’s relevant for your question.

Thank you both. I am referring to the subject of Andrew's last lecture in week 2 of course 2. There was (and to some extent still is) a huge debate about why neural nets get stuck in local minima, and whether that makes finding good solutions hopeless. Later it was argued that most critical points on the cost surface of a high-dimensional network are actually saddle points, not local minima, so the problem is not as severe as people had thought. I was asking whether there is a good recent survey that summarizes current opinion on this matter.
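For intuition about the saddle-point distinction, here is a small sketch (plain NumPy, using a toy 2-D function rather than a real network loss) of the standard test: at a critical point, check the signs of the Hessian's eigenvalues. All positive means a local minimum; mixed signs mean a saddle point, which gradient descent can still escape along the negative-curvature direction.

```python
import numpy as np

def f(p):
    x, y = p
    return x**2 - y**2           # toy surface with a critical point at the origin

def numeric_hessian(f, p, eps=1e-4):
    """Central-difference Hessian; accurate enough for this 2-D toy example."""
    n = len(p)
    H = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            e_i, e_j = np.eye(n)[i] * eps, np.eye(n)[j] * eps
            H[i, j] = (f(p + e_i + e_j) - f(p + e_i - e_j)
                       - f(p - e_i + e_j) + f(p - e_i - e_j)) / (4 * eps**2)
    return H

eigs = np.linalg.eigvalsh(numeric_hessian(f, np.zeros(2)))
print(eigs)  # ~[-2., 2.]: mixed signs, so the critical point is a
             # saddle, not a local minimum
```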

The thread and the paper from Yann LeCun's group were what I was looking for. I was also trying to see whether there are newer explanations along these lines.