Why do the derivatives for w_j and b differ in gradient descent?

In gradient descent for both linear and logistic regression, the derivatives for w_j and b are different: there is an x_j[i] factor outside the brackets for w_j. So my question is: why do we multiply by x_j[i] for w_j but not for b?
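
For reference, these are the gradient expressions I'm comparing (written out as I understand them from the lectures, so the notation is mine):

```latex
\frac{\partial J}{\partial w_j}
  = \frac{1}{m}\sum_{i=1}^{m}\left(f_{w,b}(x^{(i)}) - y^{(i)}\right) x_j^{(i)}
\qquad
\frac{\partial J}{\partial b}
  = \frac{1}{m}\sum_{i=1}^{m}\left(f_{w,b}(x^{(i)}) - y^{(i)}\right)
```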

That’s simply how the result comes out when you work through the calculus for the partial derivative of the cost with respect to b.
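
To sketch where that comes from (using the squared-error cost of linear regression; the logistic cost ends up with the same structure), the chain rule pulls out an inner derivative of the model with respect to each parameter:

```latex
J(w,b) = \frac{1}{2m}\sum_{i=1}^{m}\left(f_{w,b}(x^{(i)}) - y^{(i)}\right)^2,
\qquad
f_{w,b}(x^{(i)}) = \sum_{k} w_k\, x_k^{(i)} + b

\frac{\partial f_{w,b}}{\partial w_j} = x_j^{(i)},
\qquad
\frac{\partial f_{w,b}}{\partial b} = 1
```

So both gradients share the same (f_wb - y) error term; the only difference is that inner derivative, which is x_j^(i) for w_j and 1 for b.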

Intuitively, if you look at f_wb = w*x + b, you can see that ‘w’ is scaled by x, but ‘b’ is not. This is the basis for why dj_dw and dj_db have different forms.
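
A minimal NumPy sketch may make that concrete (the function name compute_gradient and the array shapes here are my own assumptions, not the exact lab code):

```python
import numpy as np

def compute_gradient(X, y, w, b):
    """Gradients of the squared-error cost for linear regression.

    X: (m, n) feature matrix, y: (m,) targets,
    w: (n,) weights, b: scalar bias.
    """
    m = X.shape[0]
    err = X @ w + b - y      # f_wb(x^(i)) - y^(i), shape (m,)
    dj_dw = (X.T @ err) / m  # each error term is scaled by its x_j^(i)
    dj_db = np.sum(err) / m  # no x factor, because d(f_wb)/db = 1
    return dj_dw, dj_db

# Tiny example with made-up numbers
X = np.array([[1.0, 2.0], [3.0, 4.0]])
y = np.array([5.0, 6.0])
w = np.zeros(2)
b = 0.0
print(compute_gradient(X, y, w, b))
```

Both dj_dw and dj_db are built from the same error vector; the only difference is whether each error is multiplied by its feature value before averaging.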


Thanks for your reply. Intuitively, I had also attributed it to the absence of x in the b term, but after your reply I'm confident in that reasoning.