Linear regression cost function 3D graph

The convex shape of the linear regression cost function seems to show that, for any value of w, moving b to its optimal value (the value of b at the global minimum) would decrease the cost. However, I can think of many cases where this would not be true. For example:

Training Examples:
x1 = 5, y1 = 25
x2 = 10, y2 = 50

The values of w and b that give the lowest cost are w = 5, b = 0.
If w and b are both 0, the cost (summing the absolute errors) is 75 for the two training examples. But if we keep w the same and bring b up to 20, away from its optimal value, the cost drops to 35, which does not match what the graph displayed in the videos shows.
Can anyone explain how this works?
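
The arithmetic in this example can be checked with a minimal Python sketch. It assumes two cost definitions that the post does not spell out: the sum of absolute errors, which reproduces the 75 and 35 quoted here, and the squared-error cost J(w, b) = 1/(2m) * sum((w*x + b - y)^2), which is presumably what the 3D plot in the videos shows. The function names are made up for illustration.

```python
# Training examples from the post.
xs = [5.0, 10.0]
ys = [25.0, 50.0]


def abs_error_sum(w, b):
    """Sum of absolute errors (an assumed reading of the 75 and 35 figures)."""
    return sum(abs(w * x + b - y) for x, y in zip(xs, ys))


def squared_error_cost(w, b):
    """Squared-error cost 1/(2m) * sum((w*x + b - y)^2) (assumed to be the plotted cost)."""
    m = len(xs)
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)


# Compare the three (w, b) pairs discussed in the post.
for w, b in [(0.0, 0.0), (0.0, 20.0), (5.0, 0.0)]:
    print(f"w={w}, b={b}: abs={abs_error_sum(w, b)}, squared={squared_error_cost(w, b)}")
```

Under either definition, the cost at w = 0 drops when b goes from 0 to 20, and only w = 5, b = 0 gives a cost of zero.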

Hi @adam_wilkes

The optimal b for a given w minimizes the cost function along the b dimension, but that point is the global minimum of the whole function only if w is also optimal. For any other w, the best b on that slice generally differs from the b at the global minimum.
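
As a worked version of this point, assuming the usual squared-error cost (the thread does not state the formula, so treat it as an assumption), the best b for a fixed w can be written out directly:

```latex
J(w, b) = \frac{1}{2m} \sum_{i=1}^{m} \left( w x_i + b - y_i \right)^2

\frac{\partial J}{\partial b} = \frac{1}{m} \sum_{i=1}^{m} \left( w x_i + b - y_i \right) = 0
\quad \Longrightarrow \quad
b^{*}(w) = \bar{y} - w \bar{x}

% For the two training examples: \bar{x} = 7.5, \bar{y} = 37.5
b^{*}(w) = 37.5 - 7.5\,w, \qquad b^{*}(0) = 37.5, \qquad b^{*}(5) = 0
```

So at w = 0 the lowest point of the slice is at b = 37.5, not b = 0. Moving b from 0 to 20 moves toward that point, which is why the cost drops in the example.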

Hope this answers your question! Feel free to ask if you need further help.

You have to optimize both w and b simultaneously.


I meant that, going by this image, moving b away from its optimal value (in this case 0) always increases the value of the cost function, no matter the value of w. With the example I provided earlier, though, moving b away from its optimal value decreased the cost, so I was confused about how that could be possible. I hope my question is clearer now.

I do not think that is true. The surface is bowl-shaped: starting at the minimum, every direction you move increases the cost.

And from every point on the surface, there is a direct path (down the gradients) that leads to the minimum cost.
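
The "path down the gradients" can be sketched with plain gradient descent on the two training examples from the question. This is only an illustration: it assumes the squared-error cost 1/(2m) * sum((w*x + b - y)^2), and the learning rate, step count, and function names are arbitrary choices rather than anything taken from the course.

```python
# Training examples from the question.
xs = [5.0, 10.0]
ys = [25.0, 50.0]
m = len(xs)


def cost(w, b):
    """Squared-error cost 1/(2m) * sum((w*x + b - y)^2) (assumed form)."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)


def gradient(w, b):
    """Partial derivatives of the cost with respect to w and b."""
    dw = sum((w * x + b - y) * x for x, y in zip(xs, ys)) / m
    db = sum((w * x + b - y) for x, y in zip(xs, ys)) / m
    return dw, db


def gradient_descent(w, b, alpha=0.02, steps=20000):
    """Update w and b together, stepping against the gradient each time."""
    for _ in range(steps):
        dw, db = gradient(w, b)
        w, b = w - alpha * dw, b - alpha * db
    return w, b


# Start from the point in the question: w = 0, b = 20.
w_final, b_final = gradient_descent(0.0, 20.0)
print(w_final, b_final, cost(w_final, b_final))
```

Starting from w = 0, b = 20, the updates change both parameters at once and end up very close to w = 5, b = 0. Along the way b first moves further from 0 before coming back, because the best b depends on the current w.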

@adam_wilkes, it may be easier to see using a contour map. For example, in this contour map:

[Image: contour plot of the cost function over w and b]

you can see that for a given w, the value of b that gives the smallest cost is a different b than the value of b at the overall minimum for the whole function. The shape is still a bowl, but the minimum cost for the part of the bowl you’re slicing through at any given w will not necessarily be at the same b as it is at the global optimum.
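
To make the slicing idea concrete, here is a rough brute-force sketch using the two training examples from the original question. It assumes the squared-error cost 1/(2m) * sum((w*x + b - y)^2); the grid of b values and the helper names are hypothetical choices for illustration only.

```python
# Training examples from the original question.
xs = [5.0, 10.0]
ys = [25.0, 50.0]
m = len(xs)


def cost(w, b):
    """Squared-error cost 1/(2m) * sum((w*x + b - y)^2) (assumed form)."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)


# Candidate b values from -50 to 50 in steps of 0.5 (arbitrary grid).
b_grid = [k * 0.5 for k in range(-100, 101)]


def best_b_on_slice(w):
    """Scan the grid and return the b with the lowest cost for this fixed w."""
    return min(b_grid, key=lambda b: cost(w, b))


for w in [0.0, 2.0, 5.0, 8.0]:
    b_star = best_b_on_slice(w)
    print(f"w={w}: best b on this slice = {b_star}, "
          f"cost there = {cost(w, b_star)}, cost at b=0 = {cost(w, 0.0)}")
```

Only the slice at w = 5 has its lowest point at b = 0. For the other values of w the lowest point of the slice sits at a different b, so moving b away from 0 can lower the cost there even though the whole surface is still a bowl.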

You can play around with the optional cost function lab to experiment with the graphs to see how this can vary with different values.
