Ok, so I have worked out an answer to my question, and I’ll post a walkthrough here to help explain what I was hoping to achieve. I understand this is not broadly applicable, and I’m not sure it is even mathematically sound.

**This is an exploration, not a suggestion.**

Let’s start with Dr. Ng’s example in Cost Function Intuition, but with a different training set.

Given the two-dimensional training data set (1, 2), (2, 3), and (3, 1)

And given the constraint that our line of best fit will have a **b** or y-intercept of **0**

And given the simplified cost function of *mean squared error*, with just parameter **w** (no b)

Determine the optimum **w** that minimizes **J(w)**, *by completing a quadratic equation and computing the vertex*.

But first, let’s do what Dr. Ng does to build some intuition. Let’s consider the lines for three contrived values of **w**, and compute the cost, **J(w)**.
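As a rough sketch of this step, the cost can be evaluated for a few trial slopes. The three values of w below (0, 0.5 and 1) are hypothetical picks for illustration, not necessarily the ones used in the walkthrough, and the cost is assumed to be the plain mean squared error with a 1/m factor.

```python
# Training set from the post; the intercept b is fixed at 0.
xs = [1, 2, 3]
ys = [2, 3, 1]

def cost(w):
    """Mean squared error J(w) = (1/m) * sum((w*x - y)^2) with b = 0."""
    m = len(xs)
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / m

# Hypothetical trial values of w, chosen only for illustration.
for w in (0.0, 0.5, 1.0):
    print(f"w = {w:.1f}  J(w) = {cost(w):.4f}")
```

The cost falls from w = 0 to w = 0.5 and falls only slightly further by w = 1, which hints that the bottom of the curve sits somewhere between 0.5 and 1.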

Next, let’s visualize these three contrived values of **w** against their cost, **J(w)**. It might look something like this:

“But wait!” I say, “If you show me a parabola, I know we can find its vertex, if we have enough information, such as the general quadratic form of the parabola, or the vertex form of the parabola.” In this contrived, simplified J(w) mean squared error situation, we can indeed come up with that form of the parabola.

We can substitute the values in the training set for m, x and y, then expand, to express J(w) as a quadratic in w.
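For reference, here is how that expansion might be written out. It assumes the cost is the mean squared error with a 1/m factor, which is what the coefficient 14/3 implies. The course version of the cost carries an extra factor of 1/2, which would halve every coefficient but leave the location of the vertex unchanged.

```latex
\begin{aligned}
J(w) &= \frac{1}{m}\sum_{i=1}^{m}\left(w x_i - y_i\right)^2 \\
     &= \frac{1}{3}\left[(w - 2)^2 + (2w - 3)^2 + (3w - 1)^2\right] \\
     &= \frac{1}{3}\left[(w^2 - 4w + 4) + (4w^2 - 12w + 9) + (9w^2 - 6w + 1)\right] \\
     &= \frac{14}{3}w^2 - \frac{22}{3}w + \frac{14}{3}
\end{aligned}
```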

Given the general form aw² + bw + c, we identify a as 14/3 and b as -22/3 (here b is the coefficient of w, not the y-intercept from earlier). We can then compute the vertex with **-b/2a**:
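The vertex arithmetic can be checked with exact fractions. This is a minimal sketch that assumes the 1/m form of the cost and takes the linear coefficient with its sign, so b = -22/3. The variable names are only illustrative.

```python
from fractions import Fraction

xs = [1, 2, 3]
ys = [2, 3, 1]
m = len(xs)

# Coefficients of J(w) = a*w^2 + b*w + c, from expanding (1/m) * sum((w*x - y)^2).
a = Fraction(sum(x * x for x in xs), m)                    # 14/3
b = Fraction(-2 * sum(x * y for x, y in zip(xs, ys)), m)   # -22/3
c = Fraction(sum(y * y for y in ys), m)                    # 14/3

w_star = -b / (2 * a)                      # w-coordinate of the vertex
j_min = a * w_star**2 + b * w_star + c     # cost at the vertex

print(w_star, float(w_star))
print(j_min, float(j_min))
```

Using `Fraction` avoids rounding, so the vertex comes out as 11/14 and the minimum cost as 25/14 before either is converted to a decimal.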

This tells us that **w ≈ 0.79** (exactly 11/14) should give us the approximate minimum of **J(w)**. If we compute **J(0.79)**, we get approximately **1.79**.

That feels about right…

Let’s graph the line and compute the mean squared error:
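In place of a plot, the same check can be done numerically. This sketch assumes the rounded slope w = 0.79 and the 1/m form of the cost, and prints each prediction next to its squared error.

```python
xs = [1, 2, 3]
ys = [2, 3, 1]
w = 0.79  # rounded vertex value from the walkthrough

squared_errors = []
for x, y in zip(xs, ys):
    prediction = w * x                      # point on the line y = w*x
    squared_error = (prediction - y) ** 2   # squared vertical distance to the data point
    squared_errors.append(squared_error)
    print(f"x = {x}  y = {y}  prediction = {prediction:.2f}  squared error = {squared_error:.4f}")

mse = sum(squared_errors) / len(xs)
print(f"J({w}) = {mse:.4f}")
```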

So, in this simplified scenario, where we compute the mean squared error with the constraint b = 0, we can indeed compute the optimal **w** directly: substitute each x and y from the training set into the cost function, expand and collect terms into an approachable form (e.g. general quadratic or vertex form), and compute the vertex (e.g. -b/2a).
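Because a works out to Σx²/m and the linear coefficient to -2Σxy/m, the vertex formula collapses to w = Σxy / Σx² for any training set with the intercept fixed at 0. The sketch below assumes that setup. `best_slope` and `cost` are hypothetical helper names, and the grid scan is only a sanity check, not part of the method.

```python
def best_slope(xs, ys):
    """Slope w minimizing (1/m) * sum((w*x - y)^2) when the intercept is fixed at 0."""
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    if sum_xx == 0:
        raise ValueError("all x values are zero, so the cost does not depend on w")
    return sum_xy / sum_xx

def cost(w, xs, ys):
    """Mean squared error of the line y = w*x over the training set."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

xs, ys = [1, 2, 3], [2, 3, 1]
w_star = best_slope(xs, ys)

# Sanity check: scan w from 0.00 to 2.00 in steps of 0.01 and keep the lowest cost.
grid = [i / 100 for i in range(201)]
w_grid = min(grid, key=lambda w: cost(w, xs, ys))

print(w_star, w_grid)
```

The closed-form slope and the best grid point should agree to within the grid spacing, which gives some reassurance that the vertex really is the minimum of the cost.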