CycleGAN: Why does LeastSquaresLoss work here (and not everywhere)?

paulinpaloalto · June 8, 2024, 12:50am

There are a lot of questions there. Maybe I will need to take a “divide and conquer” approach rather than create one huge answer. So let me parse things into subtopics.

I don’t remember anywhere in DLS where Prof Ng says anything that could be interpreted that way. The more parameters you have, the more complex your solution surfaces are and the more local minima you will have. There is realistically no hope of finding the global minimum, so whatever solution you find will be a local minimum. In fact, finding the absolute minimum would probably represent extreme overfitting in any case. But it has been shown that for sufficiently complex problems, there is a band of local minima that gradient descent is very likely to find and that are actually reasonable solutions. So what he does say is that in real solutions the “local minimum” issue turns out not to be that big a deal.

Here’s a thread that talks about the work from Yann LeCun’s group on the math showing that local minima are not really a problem. It also links to a thread that deals with the huge number of local minima created by weight space symmetry.
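
To make the weight space symmetry point concrete, here is a minimal sketch in plain Python. It is only an illustration, not code from the course or the linked threads: the tiny network, the layer sizes, and all names (`mlp`, `W1`, `perm`, and so on) are assumptions made up for the example. It shows that reordering the hidden units of a one-hidden-layer network gives a different point in weight space that computes the same function, so every minimum comes with many equivalent copies.

```python
import math
import random

def mlp(x, W1, b1, w2, b2):
    """Hypothetical tiny network: one tanh hidden layer, scalar output."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(W1, b1)]
    return sum(v * h for v, h in zip(w2, hidden)) + b2

random.seed(0)
n_in, n_hidden = 3, 4  # arbitrary sizes chosen for the illustration

# Random weights standing in for some trained solution
W1 = [[random.gauss(0, 1) for _ in range(n_in)] for _ in range(n_hidden)]
b1 = [random.gauss(0, 1) for _ in range(n_hidden)]
w2 = [random.gauss(0, 1) for _ in range(n_hidden)]
b2 = random.gauss(0, 1)

# Reorder the hidden units: the rows of W1 and the matching entries
# of b1 and w2 must be permuted together
perm = [2, 0, 3, 1]
W1_p = [W1[i] for i in perm]
b1_p = [b1[i] for i in perm]
w2_p = [w2[i] for i in perm]

x = [0.5, -1.0, 2.0]
y_original = mlp(x, W1, b1, w2, b2)
y_permuted = mlp(x, W1_p, b1_p, w2_p, b2)

# Different weights, same output (up to floating point rounding)
same_function = math.isclose(y_original, y_permuted,
                             rel_tol=1e-9, abs_tol=1e-12)

# Number of ways to reorder the hidden units of this one layer
n_orderings = math.factorial(n_hidden)
print(same_function, n_orderings)
```

With only 4 hidden units there are already 4! = 24 orderings, and the count grows factorially with layer width. That is one reason the raw number of local minima is so large without that number by itself being a problem.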

Yes, this is a good point. Sorry, my example is not really that relevant.
