Hello everybody, my question is about the comment below the two curves. I don't understand why the curves have this strange shape.

Welcome to our community, @Omar_Mohamad! Why is it strange? What would you expect it to look like?

Raymond

PS: Note that the right-hand plot is a zoomed-in view. Compare the x-axes of the two plots.

What the left plot shows is that the cost rapidly decreases to a relatively stable value after the first few iterations.

The right plot shows that after very many iterations, the cost is still decreasing but very slowly.

@rmwkwok - @TMosh

Thanks for the replies. I want to make sure I understand: does this rapid change in cost happen because the features are unnormalized? If so, I cannot see the link between the absence of normalization and this behavior of the curve.

Hello @Omar_Mohamad, if you check the costs, the cost was very high at the beginning. The initial parameters w and b matter here: they could simply be too far away from the optimum. If we had deliberately moved the initial values even farther away, the drop would have been even more dramatic. Normally we don't initialize w to zeros but to some random numbers, and that could be one reason why this cost plot looks so different.
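To make the point about the starting values concrete, here is a minimal sketch in plain Python. It is not the lab's code: the data, the learning rate `alpha`, the starting value `w=-200`, and the helper names `compute_cost` and `gradient_descent` are all made up for illustration. It runs gradient descent twice on the same toy data, once from w=0, b=0 and once from a start that is deliberately farther from the optimum, and prints a few points of each cost history.

```python
def compute_cost(xs, ys, w, b):
    """Squared-error cost J(w, b) = (1 / 2m) * sum((w*x + b - y)^2)."""
    m = len(xs)
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)


def gradient_descent(xs, ys, w, b, alpha, num_iters):
    """Batch gradient descent for one feature; returns w, b and the cost history."""
    m = len(xs)
    history = [compute_cost(xs, ys, w, b)]  # cost at the starting point
    for _ in range(num_iters):
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        dj_dw = sum(e * x for e, x in zip(errors, xs)) / m
        dj_db = sum(errors) / m
        w -= alpha * dj_dw
        b -= alpha * dj_db
        history.append(compute_cost(xs, ys, w, b))
    return w, b, history


# Hypothetical toy data, roughly y = 50x + 1 (not the data from the lab).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [52.0, 101.0, 149.0, 202.0, 251.0]

# Same learning rate and iteration count, different starting points.
_, _, hist_zero = gradient_descent(xs, ys, w=0.0, b=0.0, alpha=0.01, num_iters=1000)
_, _, hist_far = gradient_descent(xs, ys, w=-200.0, b=0.0, alpha=0.01, num_iters=1000)

for name, hist in [("start at w=0", hist_zero), ("start at w=-200", hist_far)]:
    print(name, "| initial:", hist[0], "| after 10:", hist[10], "| final:", hist[-1])
```

In this sketch the run that starts farther away begins at a much higher cost and loses more of it in the first few iterations, so its curve looks even more like a cliff followed by a long flat tail.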

Feature normalization plays a part because, at the end of the day, it affects the size of the optimal w and b, which brings us back to the first paragraph: how close the optimum is to w=0 and b=0. I do not want to analyze exactly how large that part is, but if you are interested, you can normalize the features and compare the new cost plot with the current one.
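If you want to try that comparison, the following is a rough sketch of how it could be set up in plain Python. Everything specific in it is an assumption for illustration: the house sizes and prices are invented, the two learning rates were picked by hand so that each run stays stable, and z-score normalization is just one common choice. It is not the lab's code.

```python
import statistics


def compute_cost(xs, ys, w, b):
    """Squared-error cost J(w, b) = (1 / 2m) * sum((w*x + b - y)^2)."""
    m = len(xs)
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)


def gradient_descent(xs, ys, w, b, alpha, num_iters):
    """Batch gradient descent for one feature; returns w, b and the cost history."""
    m = len(xs)
    history = [compute_cost(xs, ys, w, b)]  # cost at the starting point
    for _ in range(num_iters):
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        dj_dw = sum(e * x for e, x in zip(errors, xs)) / m
        dj_db = sum(errors) / m
        w -= alpha * dj_dw
        b -= alpha * dj_db
        history.append(compute_cost(xs, ys, w, b))
    return w, b, history


# Hypothetical data: house size in sqft and price in thousands of dollars.
sizes = [850.0, 1200.0, 1500.0, 1900.0, 2100.0]
prices = [178.0, 255.0, 308.0, 392.0, 431.0]

# Z-score normalization of the feature: (x - mean) / standard deviation.
mu = statistics.mean(sizes)
sigma = statistics.pstdev(sizes)
sizes_norm = [(x - mu) / sigma for x in sizes]

# Learning rates chosen by hand (assumption): the raw feature needs a tiny
# alpha to stay stable, while the normalized feature tolerates a larger one.
_, _, hist_raw = gradient_descent(sizes, prices, 0.0, 0.0, alpha=1e-7, num_iters=1000)
_, _, hist_norm = gradient_descent(sizes_norm, prices, 0.0, 0.0, alpha=0.1, num_iters=1000)

for it in (0, 1, 10, 100, 1000):
    print(f"iter {it:4d}  raw: {hist_raw[it]:12.4f}  normalized: {hist_norm[it]:12.4f}")
```

With w=0 and b=0 both runs start from the same cost, because every prediction is zero regardless of the feature scale. What differs is how the curve proceeds afterwards. In this sketch the raw-feature run drops sharply and then crawls, while the normalized run settles at a lower cost within the same number of iterations.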

Raymond