Looking at the convex shape of the linear regression cost function, it seems that for any value of w, changing b to its optimal value (the value of b at the global minimum) should decrease the cost. However, I can think of many cases where this would not be true. For example:
The values of w and b where the cost is lowest are w = 5, b = 0.
If w and b are both 0, the cost is 75 for the two training examples, but if we keep w the same and raise b to 20, away from its optimal value, the cost drops to 35, which does not match what the graph shown in the videos suggests.
Can anyone explain how this works?
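For concreteness, here is a quick sketch with made-up training examples (x = [1, 3], y = [5, 15], so the optimum really is w = 5, b = 0; these are not the exact numbers behind the costs above, just an illustration of the same effect):

```python
import numpy as np

# Hypothetical training set whose global minimum is at w = 5, b = 0
# (assumed only for this sketch, not the data from the post above).
x = np.array([1.0, 3.0])
y = np.array([5.0, 15.0])   # y = 5*x exactly

def cost(w, b):
    """Squared-error cost J(w, b) = 1/(2m) * sum((w*x + b - y)^2)."""
    m = len(x)
    return np.sum((w * x + b - y) ** 2) / (2 * m)

print(cost(5, 0))    # 0.0  -> the global minimum
print(cost(0, 0))    # 62.5 -> w stuck at 0, b at its "global-optimum" value 0
print(cost(0, 10))   # 12.5 -> moving b away from 0 actually lowers the cost
```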
The optimal b for a given w minimizes the cost function along the b-dimension, but that doesn't mean it reaches the global minimum of the whole function unless w is also optimal.
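Here's a minimal sketch of that idea, using an assumed two-point toy data set whose global minimum is at w = 5, b = 0 (not your actual data): setting dJ/db = 0 for a fixed w gives b = mean(y - w*x), and that slice-wise optimal b only equals the global-minimum b when w is already optimal.

```python
import numpy as np

# Assumed toy data set with its global minimum at w = 5, b = 0.
x = np.array([1.0, 3.0])
y = np.array([5.0, 15.0])

def best_b_for_fixed_w(w):
    """b that minimizes J(w, b) with w held fixed.
    From dJ/db = (1/m) * sum(w*x + b - y) = 0  =>  b = mean(y - w*x)."""
    return np.mean(y - w * x)

print(best_b_for_fixed_w(0.0))  # 10.0 -> best b along the w = 0 slice, not 0
print(best_b_for_fixed_w(5.0))  # 0.0  -> matches the global-minimum b only here
```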
Hope this answers your question! Feel free to ask if you need further help.
I meant that, going by this image, changing b away from its optimal value (in this case 0) appears to always increase the cost function, no matter the value of w. With the example I provided earlier, I was confused about how that could be possible, since changing b away from its optimal value decreased the cost in that scenario. Hope my question is clearer now.
@adam_wilkes, it may be easier to see this with a contour map. For example, in this one:
you can see that for a given w, the value of b that gives the smallest cost can be different from the value of b at the overall minimum of the whole function. The shape is still a bowl, but the lowest cost along the slice of the bowl you're cutting through at any given w will not necessarily occur at the same b as the global minimum.
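If you'd like to check that numerically rather than by eye, here's a rough sketch that scans a grid of (w, b) values over toy data whose minimum is at w = 5, b = 0 (an assumption just for this example) and reports the lowest-cost b along each fixed-w slice:

```python
import numpy as np

# Toy data with its global minimum at w = 5, b = 0 (assumed for this sketch).
x = np.array([1.0, 3.0])
y = np.array([5.0, 15.0])

def cost(w, b):
    """Squared-error cost J(w, b) for the two training examples."""
    return np.sum((w * x + b - y) ** 2) / (2 * len(x))

# Like reading the contour plot: for each fixed w, find the b with lowest cost.
b_grid = np.linspace(-20, 20, 401)
for w in [0.0, 2.5, 5.0, 7.5]:
    costs = [cost(w, b) for b in b_grid]
    best_b = b_grid[int(np.argmin(costs))]
    print(f"w = {w:4.1f}: lowest-cost b on this slice is about {best_b:5.1f}")
```

The "valley" of the bowl drifts to a different b as w changes, which is exactly why moving b away from the global-minimum b can still lower the cost when w is off.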
You can also play around with the optional cost function lab and experiment with the graphs to see how this varies with different values.