Identifying convergence in code

I am reviewing the C1_W2_Lab02_Multiple_Variable_Soln notebook.

There is a function, gradient_descent, that runs for 1000 iterations and treats the resulting parameter values as good enough.

Per the course, it seems there are two ways to come to this conclusion:

  1. Plot a learning curve and visually choose the number of iterations (and the learning rate \alpha) that gets you the best result
  2. Run an automatic convergence test with, say, \epsilon = 10^{-3}: if J(\vec{w}, b) decreases by less than \epsilon in one iteration, consider convergence found

How was this 1000 determined? Is it just an example?

It’s just an example.

A cost history plot is useful for eyeballing the convergence.

The catch with using an epsilon value is that you don’t know in advance what epsilon is good enough, so you still have to experiment.
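To make the automatic test concrete, here is a minimal sketch of gradient descent for linear regression that stops early once the cost improves by less than epsilon. The names mirror the lab’s conventions but this is my own illustration, not the notebook’s implementation.

```python
import numpy as np

def gradient_descent(X, y, w, b, alpha, max_iters=1000, epsilon=1e-3):
    """Linear-regression gradient descent with an automatic convergence
    test: stop when J(w, b) decreases by less than epsilon in one step.
    (Illustrative sketch, not the lab's actual code.)"""
    m = X.shape[0]
    cost_history = []
    for _ in range(max_iters):
        err = X @ w + b - y            # prediction error, shape (m,)
        cost = (err ** 2).mean() / 2   # J(w, b)
        if cost_history and cost_history[-1] - cost < epsilon:
            cost_history.append(cost)
            break                      # converged: improvement < epsilon
        cost_history.append(cost)
        w = w - alpha * (X.T @ err) / m   # gradient step on w
        b = b - alpha * err.mean()        # gradient step on b
    return w, b, cost_history
```

With a tiny epsilon this behaves like the fixed-1000-iteration version; with a looser epsilon it may stop much earlier, which is exactly why the threshold itself becomes a knob you have to tune.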

Don’t forget that ‘the best result’ cannot be determined from the training cost curve alone; it needs to incorporate generalization performance as well. You want to train to the minimum of the solid line in the linked graph, not just the training curve (blue dots) or the validation curve (green dots).


Graph pulled off the interweb here: Early Stopping Definition | DeepAI

Thanks for the insight here. I am only on week 2 and this has not been mentioned yet.

This topic is covered later in the course.