Some fun graphs derived from the Week 2 Programming Assignment

But you already made the key point in your previous post:

The J value is trashed, but gradient descent still works just fine. The only real use for J is as an inexpensive proxy for how your convergence is going.

These points were also included in that thread I linked above, but in a later post on that same thread.

2 Likes