Understanding Cost function vs Gradient descent similarities


I’m struggling to come to terms here (pun intended), specifically with J(w,b).

For the cost function, the definition includes a factor of 1/(2m).

For gradient descent, the formula instead has a factor of 1/m.

As I’m typing this out, I’m realizing that the latter is NOT actually J(w,b) but rather dJ(w,b)/dw. Now I presume that multiplying J(w,b) and d/dw somehow yields the gradient descent formula from the cost function. Admittedly, I’m not entirely sure what I’m saying, and I’d greatly appreciate anyone taking the time to help me understand. I also realize that understanding this may be beyond my current math knowledge and that it’s just something I’ll have to accept.

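It’s a calculus thing. The gradients are the partial derivatives of the cost function with respect to w and b. Note that d/dw is not something you multiply J(w,b) by; it’s an operator that means “take the derivative of J with respect to w”. When you differentiate the squared term in the cost, the power rule brings down a factor of 2, which cancels the 1/2 in 1/(2m) and leaves 1/m.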
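
For anyone who wants to see the step written out, here is a sketch of the derivation. It assumes the usual squared-error cost for single-feature linear regression with the model f_{w,b}(x) = wx + b; the exact notation in the course may differ slightly.

$$
J(w,b) = \frac{1}{2m} \sum_{i=1}^{m} \left( f_{w,b}(x^{(i)}) - y^{(i)} \right)^2, \qquad f_{w,b}(x) = wx + b
$$

Applying the chain rule to each squared term:

$$
\frac{\partial J}{\partial w} = \frac{1}{2m} \sum_{i=1}^{m} 2 \left( f_{w,b}(x^{(i)}) - y^{(i)} \right) x^{(i)} = \frac{1}{m} \sum_{i=1}^{m} \left( f_{w,b}(x^{(i)}) - y^{(i)} \right) x^{(i)}
$$

$$
\frac{\partial J}{\partial b} = \frac{1}{2m} \sum_{i=1}^{m} 2 \left( f_{w,b}(x^{(i)}) - y^{(i)} \right) = \frac{1}{m} \sum_{i=1}^{m} \left( f_{w,b}(x^{(i)}) - y^{(i)} \right)
$$

The extra x^{(i)} in the first result comes from differentiating wx^{(i)} + b with respect to w; differentiating with respect to b gives 1 instead, so no extra factor appears there.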

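
If the calculus is hard to take on faith, one way to convince yourself is a numerical check. The sketch below assumes the squared-error cost for single-feature linear regression (model w*x + b); the function names and the tiny dataset are made up for illustration. It compares the 1/m gradient formulas against the slope of the 1/(2m) cost measured directly with a small nudge in w or b.

```python
def cost(w, b, xs, ys):
    """Squared-error cost J(w,b), with the 1/(2m) factor."""
    m = len(xs)
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)


def grad_w(w, b, xs, ys):
    """Analytic dJ/dw, with the 1/m factor."""
    m = len(xs)
    return sum((w * x + b - y) * x for x, y in zip(xs, ys)) / m


def grad_b(w, b, xs, ys):
    """Analytic dJ/db, with the 1/m factor."""
    m = len(xs)
    return sum((w * x + b - y) for x, y in zip(xs, ys)) / m


# Hypothetical toy data and parameter values, chosen only for the check.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 7.0]
w, b = 0.5, 0.1
eps = 1e-6

# Central differences: measure the slope of J by nudging one parameter.
numeric_dw = (cost(w + eps, b, xs, ys) - cost(w - eps, b, xs, ys)) / (2 * eps)
numeric_db = (cost(w, b + eps, xs, ys) - cost(w, b - eps, xs, ys)) / (2 * eps)

print("dJ/dw analytic:", grad_w(w, b, xs, ys), "numeric:", numeric_dw)
print("dJ/db analytic:", grad_b(w, b, xs, ys), "numeric:", numeric_db)
```

The analytic and numeric values should agree to several decimal places, which shows that the 1/m formula really is the slope of the 1/(2m) cost and not a separate definition.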