But you already made the key point in your previous post:
The J value is trashed, but gradient descent still works just fine. The only real use for J is as an inexpensive proxy for how your convergence is going.
The thread I linked above makes the same points, but in a later post on that thread.
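
For anyone who wants to see this concretely, here is a minimal sketch in NumPy. It assumes the "trashed" J comes from a saturated sigmoid, where an activation rounds to exactly 1.0 in float64 and the cross entropy cost ends up taking log(0). That cause is my assumption for illustration, not something stated in the quote. The toy data, the learning rate, and the name `run_demo` are all made up for the example.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def run_demo(num_iters=5, learning_rate=0.01):
    """Logistic regression on hypothetical toy data that saturates sigmoid."""
    # Assumed toy data: one feature, two samples, labels chosen so the
    # initial weight gives confidently wrong predictions.
    X = np.array([[50.0, -50.0]])  # shape (n_x, m) = (1, 2)
    Y = np.array([[0.0, 1.0]])     # shape (1, m)
    w = np.array([[1.0]])          # shape (n_x, 1)
    b = 0.0
    m = X.shape[1]

    costs = []
    for _ in range(num_iters):
        A = sigmoid(w.T @ X + b)   # sigmoid(50) rounds to exactly 1.0

        # The cost is the only place the log appears, so it is the only
        # thing damaged when A hits exactly 0 or 1.
        with np.errstate(divide="ignore", invalid="ignore"):
            J = -np.mean(Y * np.log(A) + (1 - Y) * np.log(1 - A))
        costs.append(float(J))

        # The gradients come from A - Y, with no log involved,
        # so they stay finite even when J does not.
        dZ = A - Y
        dw = (X @ dZ.T) / m
        db = float(np.mean(dZ))
        assert np.isfinite(dw).all() and np.isfinite(db)

        w = w - learning_rate * dw
        b = b - learning_rate * db

    return costs, w


costs, w_final = run_demo()
print(costs)
```

With these made-up numbers the first cost is not finite, but dw is still a perfectly ordinary number because it is computed from A - Y and never touches the log. The updates keep moving w in the right direction, and the costs on later iterations are finite again. That is the sense in which J is just a monitoring proxy here.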