# Repeat until convergence?

I must be missing something really obvious, but please bear with me…
At the end of Week 1 we describe gradient descent as: repeat (updating w and b) until convergence.
I understand what it means, and the visualizations are very helpful, but I don’t see in the code how we test for convergence. Shouldn’t we be comparing the updated cost J(w, b) with the previous cost J(w, b) to ensure it is still decreasing and we did not overstep the minimum? I see in the code we stop computing gradient descent when a fixed number of iterations is reached - but that number is arbitrary.
We should be able to do it programmatically, right?
thank you!

Since this is Week 1 of an introductory course, we don’t actually test for convergence in the code. It’s done visually from the cost history plot.
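For what it’s worth, the check Svetlana describes can be done programmatically. Here is a minimal sketch (plain Python, hypothetical function names - not the course’s actual lab code) of gradient descent on one-variable linear regression that stops when the drop in cost falls below a tolerance:

```python
def gradient_descent(x, y, w, b, alpha, max_iters=100000, tol=1e-7):
    """Gradient descent for y ~ w*x + b, with a simple convergence test.

    Stops early when the cost J decreases by less than `tol` in one step,
    instead of always running a fixed, arbitrary number of iterations.
    """
    m = len(x)

    def cost(w, b):
        # Squared-error cost J(w, b) = (1/2m) * sum((w*x_i + b - y_i)^2)
        return sum((w * xi + b - yi) ** 2 for xi, yi in zip(x, y)) / (2 * m)

    prev_j = cost(w, b)
    for i in range(max_iters):
        # Gradients of J with respect to w and b
        dj_dw = sum((w * xi + b - yi) * xi for xi, yi in zip(x, y)) / m
        dj_db = sum((w * xi + b - yi) for xi, yi in zip(x, y)) / m
        w -= alpha * dj_dw
        b -= alpha * dj_db
        j = cost(w, b)
        if abs(prev_j - j) < tol:  # cost barely changed: declare convergence
            return w, b, i + 1
        prev_j = j
    return w, b, max_iters
```

Note the caveat in the replies below: a simple cost comparison like this does not distinguish a true minimum from a too-large learning rate, which is why in practice people monitor a validation cost as well.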

Hello @Svetlana_Verthein,

Great observation! For your information, that’s called “Early Stopping”, and it isn’t covered in this course; however, the idea is just like what you have suggested. In particular, we compare the cost on the cross-validation (cv) set, so that we “early stop” once the cv cost stops improving. TensorFlow implements this (here is the link), and I think you will want to have a look at the list of parameters you can set, such as `monitor`, `min_delta`, and `patience`.
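To make the `min_delta`/`patience` idea concrete, here is a plain-Python sketch of the stopping rule (this is an illustration of the concept, not TensorFlow’s actual implementation): keep the best cv cost seen so far, and stop after `patience` consecutive epochs whose improvement is smaller than `min_delta`.

```python
def early_stop_epoch(costs_cv, min_delta=1e-4, patience=5):
    """Return the epoch index at which training would early-stop.

    costs_cv: sequence of cross-validation costs, one per epoch.
    Stops after `patience` consecutive epochs whose improvement over the
    best cost seen so far is smaller than `min_delta`.
    """
    best = float("inf")
    wait = 0
    for epoch, j in enumerate(costs_cv):
        if best - j > min_delta:  # meaningful improvement: reset the counter
            best = j
            wait = 0
        else:
            wait += 1
            if wait >= patience:  # cv cost has plateaued
                return epoch
    return len(costs_cv) - 1      # never triggered: ran all epochs
```

In Keras you would get the same behavior by passing `tf.keras.callbacks.EarlyStopping(monitor="val_loss", min_delta=..., patience=...)` to `model.fit`.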

Cheers,
Raymond

Thank you, Raymond! This is very helpful. I’m in Week 2 now, and I now understand why convergence can’t be tested by simply comparing the new cost J to the previous one (I thought that was all that was needed!)
It’s because if the new J starts increasing, it may mean either: a) the minimum of J has been reached and we overshot it, or b) alpha is too large (or there is a bug in the code) - correct? Two very different scenarios.
Looking into Tensorflow EarlyStopping now - thanks for the link!
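Scenario (b) is easy to demonstrate on a toy problem. This is my own sketch (not course code), using the one-parameter cost J(w) = w², whose gradient is 2w: with a small alpha the cost shrinks every step, while a too-large alpha overshoots the minimum and makes J grow every step.

```python
def descend(alpha, w=1.0, steps=5):
    """Gradient descent on J(w) = w**2; return the cost history."""
    costs = []
    for _ in range(steps):
        w -= alpha * 2 * w  # gradient of w**2 is 2*w
        costs.append(w ** 2)
    return costs

# alpha = 0.1: each step multiplies w by 0.8, so J decreases.
# alpha = 1.5: each step multiplies w by -2, so J quadruples - divergence.
```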