Another thing worth saying here is that, before you get too far down this rabbit hole, please realize that Linear Regression is by far the simplest problem mathematically that you will see here. It's actually solvable in closed form! As soon as you graduate to a neural network with more than one layer, you can say "bye bye" to convexity and closed-form solvability. It's not unusual to see neural networks with hundreds of layers and millions of parameters, so the solution surfaces are non-convex and embedded in $\mathbb{R}^n$ for some very large value of $n$.
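To make "closed form" concrete, here's a minimal sketch of the normal-equation solution $\hat{w} = (X^\top X)^{-1} X^\top y$, using NumPy and synthetic data. Everything in it (the data, the dimensions, the variable names) is purely illustrative.

```python
import numpy as np

# Synthetic data: 100 samples, 3 features, plus a little noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
true_w = np.array([2.0, -1.0, 0.5])
y = X @ true_w + rng.normal(scale=0.1, size=100)

# Append a bias column, then solve the normal equations
# (X^T X) w = X^T y directly -- no iterative optimization needed.
Xb = np.hstack([X, np.ones((100, 1))])
w_hat = np.linalg.solve(Xb.T @ Xb, Xb.T @ y)
print(w_hat)  # should land close to [2.0, -1.0, 0.5, 0.0]
```

That one `np.linalg.solve` call is the whole "training" step, which is exactly what you lose once the model stops being linear.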
Here’s a paper from Yann LeCun’s group that discusses solution surfaces for neural networks. And here’s a thread about Weight Space Symmetry and the number of potential local optima, which is more food for thought.