If a square error cost function always has convexity property, which means that there is always one global minima and the gradient decent algorithm will always end up at the global minima, then it works perfectly well with a regression model with 1 independent variable. For example f(x) = wx + b.

But when we have a regression model that consists of multiple independent variables (more than 1) then the cost function will have local minima (more than 1 minima), which means that there will be non-convexity in the cost function.

Based on this, that a regression model with more than 1 variable causes non-convexity in the cost function, and a square error cost function will always have convexity property. How is it possible that the square error cost function is used for a regression model that has more than one independent variable? Intuitively it makes sense that it’s not possible, but Chatgpt says that it is possible, but I’m failing to understand its explanation.