What is the purpose of the 2nd plot, the one with "tail" in it, and what does it show? I don't understand how, for the same iteration steps, the cost vs. iteration graph changes from an L-shaped curve to a straight line, or how this helps us understand the predictions.
The first plot seems to show that the cost ceases to improve, so the second plot is there to show that the cost is in fact still decreasing over the iterations. The second plot is a zoom-in of the first. Look at the difference in the y-scale of the two plots: the second spans a much smaller range.
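To make the zoom-in idea concrete, here is a minimal sketch in plain Python. It is not the course's lab code: the toy data, the function names, and the choice of `tail_start = 100` are all assumptions made for illustration. It runs gradient descent on a one-parameter model, records the cost at every iteration, and compares the y-range of the full history with the y-range of the tail.

```python
# Hypothetical toy problem (not from the course): fit y = w * x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]


def cost(w):
    """Mean squared error cost, with the conventional 1/2 factor."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / (2 * len(xs))


def gradient(w):
    """Derivative of the cost with respect to w."""
    return sum((w * x - y) * x for x, y in zip(xs, ys)) / len(xs)


def run_gradient_descent(w=0.0, alpha=0.005, iterations=500):
    """Return the cost recorded before each of the parameter updates."""
    history = []
    for _ in range(iterations):
        history.append(cost(w))
        w -= alpha * gradient(w)
    return history


history = run_gradient_descent()

# Assumed cut-off: the "tail" drops the first 100 iterations,
# so the large cost at iteration 0 is not part of it.
tail_start = 100
tail = history[tail_start:]

print(f"full plot y-range: {min(history):.6f} to {max(history):.6f}")
print(f"tail plot y-range: {min(tail):.6f} to {max(tail):.6f}")
```

To draw the two plots, the full history would go against `range(len(history))` and the tail against `range(tail_start, len(history))`. The tail is the same data with the early iterations left out, which is why the high cost near iteration 0 does not appear in it and why its y-axis covers a much smaller range.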
I didn’t quite get you. How can there be 2 points on the same graph? If the 2nd plot is a zoomed-in version of the first, then from what I can see the cost at iteration 0 is very high, but that does not seem to be reflected in the 2nd plot.
Not quite. I believe it just showed you an example of how the cost curve could evolve over iterations. It is only one example, and it is not meant to be representative.
The number of iterations needed for convergence differs from case to case, and whatever numbers you see here or in the rest of the course, we can’t expect the same to happen in our own future machine learning projects.
As a learner myself, I don’t take things for granted easily. If somebody told me their model had converged in 10 iterations, I would say “OK”, and then see how my own projects go before deciding whether “10” is a number worth remembering. Probably only for cases out of my reach (such as training a 100-billion-parameter model, because I don’t have the resources) would I keep in mind the numbers others told me. However, once I am in a position to train such a large model, those numbers I have remembered will be challenged.