In gradient descent, for both linear and logistic regression, the derivatives for `w_j` and `b` are different: there is an `x_j[i]` factor in the expression for `w_j` but not in the one for `b`. So my question is: why do we multiply by `x_j[i]` for `w_j` but not for `b`?

That's just how the result comes out when you do the calculus for the partial derivative of the cost with respect to `b`.

Intuitively, if you look at `f_wb = w*x + b`, you can see that `w` is scaled by `x`, but `b` is not. This is the basis for why `dj_dw` and `dj_db` have different forms.
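To make that concrete, here is a sketch of the chain-rule step for the squared-error cost of linear regression (notation follows the thread: `f_wb`, `w_j`, `b`, with `m` training examples; the logistic case yields the same final forms after the log-loss and sigmoid derivatives cancel):

```latex
J(\mathbf{w},b) = \frac{1}{2m}\sum_{i=1}^{m}\bigl(f_{\mathbf{w},b}(\mathbf{x}^{(i)}) - y^{(i)}\bigr)^2,
\qquad
f_{\mathbf{w},b}(\mathbf{x}^{(i)}) = \textstyle\sum_{j} w_j\, x_j^{(i)} + b

% Chain rule: the outer factor (f - y) is identical for every parameter;
% only the inner derivative of f differs.
\frac{\partial f}{\partial w_j} = x_j^{(i)},
\qquad
\frac{\partial f}{\partial b} = 1

\frac{\partial J}{\partial w_j}
 = \frac{1}{m}\sum_{i=1}^{m}\bigl(f_{\mathbf{w},b}(\mathbf{x}^{(i)}) - y^{(i)}\bigr)\, x_j^{(i)},
\qquad
\frac{\partial J}{\partial b}
 = \frac{1}{m}\sum_{i=1}^{m}\bigl(f_{\mathbf{w},b}(\mathbf{x}^{(i)}) - y^{(i)}\bigr)
```

So the `x_j[i]` factor is exactly the inner derivative of `f` with respect to `w_j` from the chain rule; for `b` that inner derivative is 1, which is why the factor disappears.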


Thanks for your reply. Intuitively, I had also noticed the absence of `x` in the `b` part; after your reply I'm confident about it.