In the gradient_descent function, we print the history of the descent. However, I am struggling to see why the numbers printed are correct.
In any given iteration, dj_dw and dj_db are computed from w and b before they are updated, whereas J_history is computed after w and b are updated. Yet the two are printed together.
I do not understand why the gradient from before a step should be associated with the cost from after that step, especially since p_history also stores the w, b values from after the step. Did I misunderstand the code, or is something else going on?
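To make the ordering concrete, here is a minimal sketch of how I read the loop. It is not the lab's actual code: compute_cost and compute_gradient are my own stand-ins for a one-feature linear regression, grad_history is a list I added purely for illustration, and the data values are made up.

```python
def compute_cost(x, y, w, b):
    # Squared-error cost for one-feature linear regression (assumed form).
    m = len(x)
    return sum((w * xi + b - yi) ** 2 for xi, yi in zip(x, y)) / (2 * m)


def compute_gradient(x, y, w, b):
    # Partial derivatives of the cost above with respect to w and b.
    m = len(x)
    dj_dw = sum((w * xi + b - yi) * xi for xi, yi in zip(x, y)) / m
    dj_db = sum((w * xi + b - yi) for xi, yi in zip(x, y)) / m
    return dj_dw, dj_db


def gradient_descent(x, y, w, b, alpha, num_iters):
    J_history, p_history, grad_history = [], [], []
    for _ in range(num_iters):
        # Gradient is evaluated at w, b BEFORE the step.
        dj_dw, dj_db = compute_gradient(x, y, w, b)
        w = w - alpha * dj_dw
        b = b - alpha * dj_db
        # Cost and parameters are recorded AFTER the step.
        J_history.append(compute_cost(x, y, w, b))
        p_history.append((w, b))
        # grad_history is hypothetical, added only to compare the two.
        grad_history.append((dj_dw, dj_db))
    return w, b, J_history, p_history, grad_history


# Made-up data, only to exercise the loop.
x_train = [1.0, 2.0]
y_train = [300.0, 500.0]
w_fin, b_fin, J_hist, p_hist, g_hist = gradient_descent(
    x_train, y_train, 0.0, 0.0, 0.01, 3
)
```

If the loop really has this structure, then in iteration i the printed gradient belongs to the parameters stored at p_hist[i - 1], while the printed cost belongs to p_hist[i]. The values are not wrong, but the gradient lags the cost by one step, which seems to be the mismatch I am describing.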
Help appreciated, thank you.