Gradient descent for logistic regression (why x_j^(i)?)

Hi all,

Below is a slide from “Gradient descent for logistic regression.” I just don’t understand why the following formula has x_j^(i) in it. This might be beyond the scope of our course, but could anyone explain it mathematically? Thanks!

It requires about two pages of calculus to understand why x_j appears in the gradients with respect to w.
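That said, the key step can be sketched in a few lines. This is a sketch in the usual course notation (sigmoid output f^(i), linear term z^(i) = w·x^(i) + b, cross-entropy cost J), not a substitute for the full derivation:

```latex
% Model and cost:
%   f^{(i)} = \sigma(z^{(i)}), \qquad z^{(i)} = \mathbf{w} \cdot \mathbf{x}^{(i)} + b
%   J(\mathbf{w}, b) = -\frac{1}{m} \sum_{i=1}^{m}
%       \left[ y^{(i)} \log f^{(i)} + (1 - y^{(i)}) \log (1 - f^{(i)}) \right]
%
% Chain rule for a single weight w_j:
\frac{\partial J}{\partial w_j}
  = \frac{1}{m} \sum_{i=1}^{m}
      \frac{\partial J_i}{\partial f^{(i)}} \,
      \frac{\partial f^{(i)}}{\partial z^{(i)}} \,
      \frac{\partial z^{(i)}}{\partial w_j}
```

Two facts finish it off: the sigmoid derivative is σ'(z) = σ(z)(1 − σ(z)), which cancels against ∂J_i/∂f^(i) to leave (f^(i) − y^(i)); and since z^(i) is linear in the weights, ∂z^(i)/∂w_j = x_j^(i). That last factor is exactly where the x_j^(i) in the slide comes from, giving ∂J/∂w_j = (1/m) Σ_i (f^(i) − y^(i)) x_j^(i).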

Here is one presentation of the details:


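You can also convince yourself numerically: the analytic gradient (1/m) Σ (f^(i) − y^(i)) x_j^(i) should match a finite-difference estimate of ∂J/∂w_j. A minimal sketch with made-up data (the dataset and shapes here are just for illustration):

```python
import numpy as np

# Hypothetical tiny dataset: m=5 examples, n=3 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))
y = np.array([0, 1, 1, 0, 1], dtype=float)
w = rng.normal(size=3)
b = 0.1

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost(w, b):
    # Cross-entropy cost J(w, b).
    f = sigmoid(X @ w + b)
    return -np.mean(y * np.log(f) + (1 - y) * np.log(1 - f))

# Analytic gradient: (1/m) * sum_i (f^(i) - y^(i)) * x_j^(i).
# The X.T factor is exactly the x_j^(i) term from the derivation.
f = sigmoid(X @ w + b)
grad_w = X.T @ (f - y) / len(y)

# Central finite-difference check of each dJ/dw_j.
eps = 1e-6
for j in range(3):
    wp, wm = w.copy(), w.copy()
    wp[j] += eps
    wm[j] -= eps
    numeric = (cost(wp, b) - cost(wm, b)) / (2 * eps)
    assert abs(numeric - grad_w[j]) < 1e-6
```

If you drop the x_j^(i) factor from `grad_w`, the assertions fail, which is a quick sanity check that the term really belongs in the formula.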
Great, thanks TMosh!