Understanding Cost function vs Gradient descent similarities

Hello!

I’m struggling to come to terms here, pun intended. Specifically, J(w,b).

The cost function J(w,b) is defined with a factor of 1/(2m):
[image: cost function formula]

For gradient descent, the formula instead has a factor of 1/m:
[image: gradient descent formula]

As I’m typing this out I’m realizing that the latter is NOT actually J(w,b) but rather dJ(w,b)/dw. Now I presume that multiplying J(w,b) and d/dw somehow yields the gradient descent formula from the cost function. Admittedly, I’m not entirely sure what I’m saying and greatly appreciate anyone taking the time to help me understand. I also realize understanding this may be out of the scope of my math knowledge and it’s just something I’ll have to accept.

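It’s a calculus thing. The gradients are the partial derivatives of the cost function with respect to w and b. Nothing is being multiplied by d/dw; it means "differentiate with respect to w". Differentiating the squared term brings down a factor of 2, which cancels the 1/2 in 1/(2m) and leaves 1/m.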
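
To make that concrete, here is a sketch that assumes the usual squared-error cost for linear regression with the model f(x) = w*x + b. The formula images in the question did not survive, so this exact form is an assumption rather than a quote from the course:

$$J(w,b) = \frac{1}{2m}\sum_{i=1}^{m}\left(w x^{(i)} + b - y^{(i)}\right)^2$$

Applying the chain rule with respect to w:

$$\frac{\partial J}{\partial w} = \frac{1}{2m}\sum_{i=1}^{m} 2\left(w x^{(i)} + b - y^{(i)}\right)x^{(i)} = \frac{1}{m}\sum_{i=1}^{m}\left(w x^{(i)} + b - y^{(i)}\right)x^{(i)}$$

If the calculus is unfamiliar, you can also check it numerically. The function names and the tiny dataset below are made up for illustration:

```python
def cost(w, b, xs, ys):
    # Assumed squared-error cost with the 1/(2m) factor.
    m = len(xs)
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)


def grad_w(w, b, xs, ys):
    # Analytic partial derivative dJ/dw; note the 1/m factor.
    m = len(xs)
    return sum((w * x + b - y) * x for x, y in zip(xs, ys)) / m


def numeric_grad_w(w, b, xs, ys, eps=1e-6):
    # Central finite difference: slope of the cost as w is nudged slightly.
    return (cost(w + eps, b, xs, ys) - cost(w - eps, b, xs, ys)) / (2 * eps)


# Hypothetical toy data, only for the check.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 2.5, 4.0]
w, b = 0.5, 0.1

analytic = grad_w(w, b, xs, ys)
numeric = numeric_grad_w(w, b, xs, ys)
print(analytic, numeric)  # the two values should agree closely
```

The 1/(2m) in the cost is there purely for convenience: it makes the derivative come out without a stray 2, and it does not change where the minimum is.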
