C1 w2 ML specialization - Option Lab - Multi Variable Linear Regression

What is the purpose of the 2nd plot (the one showing the "tail"), and what does it show? I am unable to understand how, for the same iteration steps, the cost vs. iteration graph changes from an L-shaped curve into an almost straight line, and how this helps us understand the predictions.

Can someone please explain? Thank you!

Hi @GodRishUniverse

The first plot seems to show that the cost ceases to improve, so we have the second plot to show that the cost is in fact still decreasing over the iterations. The second plot is a zoom-in of the first. See the difference in the y-scale of the two plots - the second spans a much smaller range.
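To make this concrete, here is a minimal sketch (not the lab's code - the data, learning rate, and iteration count are made up) that runs gradient descent on a toy single-feature regression and records the cost at every iteration. The full history is dominated by the huge early drop, which squashes the tail into what looks like a flat line, yet the tail is still decreasing:

```python
import numpy as np

# Toy single-feature linear regression fit by gradient descent.
# The cost at every iteration is recorded, mimicking the lab's
# cost-vs-iteration plot.
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 50)
y = 3.0 * x + 2.0 + rng.normal(0, 1.0, 50)

w, b, alpha = 0.0, 0.0, 0.01
costs = []
for _ in range(1000):
    err = w * x + b - y
    costs.append(float(np.mean(err ** 2) / 2))  # squared-error cost
    w -= alpha * np.mean(err * x)               # gradient step on w
    b -= alpha * np.mean(err)                   # gradient step on b

# The initial drop dominates the y-scale of the full curve...
print(costs[0], costs[100])
# ...but the "tail" (iteration 100 onward) is still decreasing:
print(costs[100] > costs[-1])
```

Plotted on the full y-range, `costs` looks L-shaped; plotted from iteration 100 onward with its own y-range, the same data shows a clearly decreasing curve.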



Thank you for your reply @rmwkwok !

I didn’t quite get you. How can there be two plots of the same data? If the 2nd plot is a zoomed-in version of the first, then from what I can see the cost at iteration 0 is very high, but that does not seem to be reflected in the 2nd plot.


The x-axis of the 2nd plot doesn’t make it obvious, but x=0 isn’t covered there.



If you read this part of the code, it says x starts from 100, not from 0, so x=0 isn’t covered in the 2nd plot.
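For illustration, here is a sketch of that slicing step with a synthetic cost history (the exponential-decay shape of `J_hist` is made up, not the lab's actual values). The x values for the tail plot start at 100, so iteration 0 and its large cost never appear on the second plot:

```python
import numpy as np

# Synthetic cost history shaped like the lab's: a steep early drop
# followed by a long, slowly decaying tail.
J_hist = 100.0 * np.exp(-np.arange(1000) / 50.0) + 1.0 / (1.0 + np.arange(1000))

tail_start = 100
tail_x = tail_start + np.arange(len(J_hist) - tail_start)  # 100, 101, ..., 999
tail_y = J_hist[tail_start:]                               # costs from iter 100 on

# The tail's x-axis begins at 100, so the huge cost at iteration 0
# is excluded, and the y-scale can shrink to show the slow decrease.
print(tail_x[0], tail_x[-1])
```

Plotting `tail_x` against `tail_y` reproduces the second plot: same data, but without iteration 0 the y-range collapses and the slow decrease becomes visible.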

ohhhh ok I get it now. Thank you so much, Raymond!

Btw, these cost vs. iteration graphs show us how efficient the algorithm is, right? Like, in how many iterations it converges?

Not quite. I believe it just showed you one example of how the cost curve could evolve over iterations. It is only one example, and it has no representative power.

The number of iterations needed for convergence differs from case to case, and whatever numbers you see here or in the rest of the course, we can’t expect the same to happen in our own future machine learning projects.
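As a small illustration of this point (on a made-up toy problem, not anything from the lab), even for the same data the iteration count depends on choices like the learning rate, so no single number generalizes:

```python
import numpy as np

# Same toy regression problem, two learning rates: the number of
# iterations needed to reach a given cost threshold differs a lot.
rng = np.random.default_rng(1)
x = rng.uniform(0, 10, 50)
y = 3.0 * x + 2.0 + rng.normal(0, 1.0, 50)

def iters_to_reach(alpha, threshold=5.0, max_iters=20000):
    """Run gradient descent; return the first iteration whose cost < threshold."""
    w = b = 0.0
    for i in range(max_iters):
        err = w * x + b - y
        if np.mean(err ** 2) / 2 < threshold:
            return i
        w -= alpha * np.mean(err * x)
        b -= alpha * np.mean(err)
    return max_iters

fast = iters_to_reach(0.02)   # larger learning rate
slow = iters_to_reach(0.002)  # 10x smaller learning rate
print(fast, slow)
```

The smaller learning rate needs many more iterations to reach the same cost; with different data, feature scaling, or initialization the counts would change again.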

As a learner myself, I don’t take things for granted easily. If somebody told me their model had converged in 10 iterations, I would say “OK”, and then see how my own projects go before concluding whether “10” is a number worth remembering. Probably only for cases out of my reach (such as training a 100-billion-parameter model, because I don’t have those resources) would I keep in mind the numbers others told me. However, once I am in a position to train such a large model, those numbers I have remembered will be challenged. :smiley:


I didn’t quite get what you meant here…(above)

I understood the rest. Thank you so much, Raymond! I really appreciate you for helping me.

Never mind :wink:. It was just a description of how I come up with rules of thumb of my own, and it may not be useful for everyone anyway.

Happy learning! :smiley:


Ah ok. Thank you so much Raymond!