Linear regression problem related to local maxima

What happens if we want to use linear regression and, as the starting point for gradient descent, we accidentally choose a local maximum? At that point the slope of the function is zero, just like at a local minimum, so our algorithm stops making updates for the same reason it normally stops at a minimum.

There is no local maximum for the combination of a linear regression model and squared loss, because this combination gives you a convex cost function.

For other combinations it is possible, and in that case, given the update formula w := w - \alpha \frac{\partial J}{\partial w}, then yes, no update will occur, because the gradient is exactly zero there. However, the chance that randomly initialized parameters land exactly on a local maximum is vanishingly small.
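For concreteness, here is a minimal NumPy sketch of that update rule applied to linear regression with squared loss (the toy data and learning rate are made up); because the cost is convex, it reaches the same solution from any starting point, however bad:

```python
import numpy as np

# Toy data: y is roughly 2x + 1 plus noise (made-up values).
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=50)
y = 2.0 * x + 1.0 + 0.1 * rng.normal(size=50)

w, b = -5.0, 7.0  # a deliberately bad starting point
alpha = 0.1       # learning rate

for _ in range(1000):
    y_hat = w * x + b
    # Gradients of J = (1/2m) * sum((y_hat - y)^2)
    dw = np.mean((y_hat - y) * x)
    db = np.mean(y_hat - y)
    w -= alpha * dw   # w := w - alpha * dJ/dw
    b -= alpha * db   # b := b - alpha * dJ/db

print(w, b)  # close to (2, 1); the convex bowl has a single stationary point
```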


As @rmwkwok mentioned, this would be a very rare occurrence. However, if by some unbelievable stroke of luck it were to happen right at the starting point, there is nothing stopping us from going back, re-initializing the weights with another set of random numbers, and then getting going with the learning algorithm.
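Here is a minimal sketch of that restart logic (the `fit_with_reinit` helper and the toy cost J(w) = w^4 - 2w^2 are hypothetical illustrations, not from the course):

```python
import numpy as np

rng = np.random.default_rng()

def fit_with_reinit(grad, alpha=0.01, steps=2000, eps=1e-12):
    # Re-draw the starting point until its gradient is nonzero, so the
    # unlucky "start exactly at a stationary point" case never sticks.
    w = rng.uniform(-2, 2)
    while abs(grad(w)) < eps:
        w = rng.uniform(-2, 2)    # re-initialize with fresh random numbers
    for _ in range(steps):
        w -= alpha * grad(w)      # ordinary gradient descent from there
    return w

# Toy non-convex cost J(w) = w^4 - 2w^2: zero slope at the local
# maximum w = 0, minima at w = -1 and w = +1.
grad = lambda w: 4 * w**3 - 4 * w
print(fit_with_reinit(grad))      # ends near -1.0 or +1.0, never stuck at 0
```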


Right! MSE (mean squared error) loss is just a variant of Euclidean distance. There is no such thing as the maximum distance between two points in \mathbb{R}^n: you have (literally) infinite room, so you can always move further away from the correct answer. :nerd_face: As Raymond says, the loss function is convex in that case, meaning it looks like the multidimensional analog of an upward-opening parabola.

But we will soon switch to using Neural Networks, and there the cost functions are no longer convex, so you can encounter local maxima and “saddle points” where the gradient is 0. So Shanup’s answer will be the saving grace once we get to that more complex situation: we can always try again with a different random initialization. The probability of exactly hitting a gradient of zero is also extremely small. As long as the gradient is not exactly zero, you’ll be able to move in some better direction, even if it takes a while to escape from the relatively flat area around the local optimum or saddle point.
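To make that concrete, here is a one-dimensional sketch reusing the made-up toy cost from the earlier post (not a real network loss): starting exactly on the stationary point stalls forever, while any nonzero gradient lets descent slide away.

```python
# Toy non-convex cost J(w) = w^4 - 2w^2: local maximum at w = 0
# (where dJ/dw = 4w^3 - 4w = 0), minima at w = -1 and w = +1.
grad = lambda w: 4 * w**3 - 4 * w

def descend(w0, alpha=0.01, steps=2000):
    w = w0
    for _ in range(steps):
        w -= alpha * grad(w)  # w := w - alpha * dJ/dw
    return w

print(descend(0.0))   # starts exactly on the maximum: stays stuck at 0.0
print(descend(1e-6))  # any tiny nonzero gradient, and it slides to w = +1
```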
