Hi all,
Below is a slide on “Gradient descent for logistic regression.” I just don’t understand why the following formula contains x_j^(i). This might be beyond the scope of our course, but could anyone explain it mathematically? Thanks!
It requires about two pages of calculus to understand why x_j appears in the gradients with respect to w.
Here is one presentation of the details:
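For anyone reading without the linked material, here is a condensed sketch of the chain-rule argument. It is not the full two-page treatment, and it assumes the usual course notation, which the thread itself does not spell out: m training examples, the model f = sigmoid(z) with z = w · x + b, and the cross-entropy (log loss) cost.

```latex
% Assumed setup (not stated in the thread): logistic model and log loss
% z^{(i)} = \vec{w}\cdot\vec{x}^{(i)} + b, \quad f^{(i)} = \sigma(z^{(i)}) = \frac{1}{1+e^{-z^{(i)}}}
% J(\vec{w},b) = -\frac{1}{m}\sum_{i=1}^{m}\Big[ y^{(i)}\log f^{(i)} + (1-y^{(i)})\log\big(1-f^{(i)}\big) \Big]
\begin{align*}
\frac{\partial J}{\partial w_j}
  &= \frac{1}{m}\sum_{i=1}^{m}
     \frac{\partial L^{(i)}}{\partial f^{(i)}}\,
     \frac{\partial f^{(i)}}{\partial z^{(i)}}\,
     \frac{\partial z^{(i)}}{\partial w_j}
     && \text{(chain rule)} \\
\frac{\partial L^{(i)}}{\partial f^{(i)}}
  &= -\frac{y^{(i)}}{f^{(i)}} + \frac{1-y^{(i)}}{1-f^{(i)}}
   = \frac{f^{(i)} - y^{(i)}}{f^{(i)}\big(1-f^{(i)}\big)} \\
\frac{\partial f^{(i)}}{\partial z^{(i)}}
  &= f^{(i)}\big(1-f^{(i)}\big)
     && \text{(derivative of the sigmoid)} \\
\frac{\partial z^{(i)}}{\partial w_j}
  &= x_j^{(i)}
     && \text{(since } z^{(i)} = \textstyle\sum_k w_k x_k^{(i)} + b\text{)} \\
\frac{\partial J}{\partial w_j}
  &= \frac{1}{m}\sum_{i=1}^{m}\big(f^{(i)} - y^{(i)}\big)\,x_j^{(i)}
\end{align*}
```

The f(1 - f) factors cancel, so the only thing left from the last link of the chain is x_j^(i): it is the derivative of z with respect to w_j. The same steps for b give a factor of 1 instead, which is why the update for b has no x term.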
Great, thanks TMosh!