In gradient descent, we’re subtracting the ground truth from the predicted value. How can that subtraction appear in the gradient when there is no such subtraction in the cost function?
1) The logistic cost does not use subtraction of the y value. It is a different cost function than the one used for linear regression. (Updated) Logistic regression uses the product of ‘y’ and the log() of the sigmoid of the prediction.
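For reference, here is the binary cross-entropy form of the logistic cost, writing f_{w,b}(x) for the sigmoid of the linear prediction:

$$J(w,b) = -\frac{1}{m}\sum_{i=1}^{m}\left[y^{(i)}\log f_{w,b}(x^{(i)}) + \left(1 - y^{(i)}\right)\log\left(1 - f_{w,b}(x^{(i)})\right)\right], \qquad f_{w,b}(x) = \frac{1}{1 + e^{-(w \cdot x + b)}}$$

Notice that no (f − y) subtraction appears anywhere in this expression.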
2) However, the equation for the gradients is quite similar to the one for linear regression. That’s due to how the partial derivatives work out.
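Concretely, the gradient takes the same form for both models (with f being each model’s own prediction function):

$$\frac{\partial J}{\partial w_j} = \frac{1}{m}\sum_{i=1}^{m}\left(f_{w,b}(x^{(i)}) - y^{(i)}\right)x_j^{(i)}, \qquad \frac{\partial J}{\partial b} = \frac{1}{m}\sum_{i=1}^{m}\left(f_{w,b}(x^{(i)}) - y^{(i)}\right)$$

The subtraction lives in the gradient, not in the cost.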
I just want to add that having the “subtraction” term in the cost function isn’t a necessary condition for having the “subtraction” term in the gradient of the cost function. We need to be clear about this. I would also like to draw your attention to this post, which shows that the “subtraction” term only shows up after taking the derivative.
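If a numerical check helps, here is a minimal sketch (NumPy only; the names logistic_cost and analytic_gradient are just illustrative, not from the course) showing that differentiating the no-subtraction cost really does produce the (f − y) term:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def logistic_cost(w, b, X, y):
    # Binary cross-entropy: no (f - y) subtraction appears here.
    f = sigmoid(X @ w + b)
    return -np.mean(y * np.log(f) + (1 - y) * np.log(1 - f))

def analytic_gradient(w, b, X, y):
    # The (f - y) subtraction shows up only after differentiating.
    f = sigmoid(X @ w + b)
    return X.T @ (f - y) / len(y)

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))
y = rng.integers(0, 2, size=20).astype(float)
w, b = rng.normal(size=3), 0.1

# Centered finite differences on the cost, one weight at a time.
eps = 1e-6
numeric = np.array([
    (logistic_cost(w + eps * e, b, X, y) - logistic_cost(w - eps * e, b, X, y)) / (2 * eps)
    for e in np.eye(3)
])
print(np.allclose(numeric, analytic_gradient(w, b, X, y)))  # True
```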
If you start from the equation for the logistic cost and work through all of the steps in the calculus of its partial derivative, you’ll then understand sentence 2) in my previous reply.
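Here is a sketch of that calculus for a single example, using the chain rule and the sigmoid identity σ′(z) = σ(z)(1 − σ(z)), with f = σ(z) and z = w·x + b:

$$\frac{\partial L}{\partial f} = -\left[\frac{y}{f} - \frac{1-y}{1-f}\right], \qquad \frac{\partial f}{\partial z} = f(1-f)$$

$$\frac{\partial L}{\partial z} = -\left[\frac{y}{f} - \frac{1-y}{1-f}\right] f(1-f) = -\left[y(1-f) - (1-y)f\right] = f - y$$

$$\frac{\partial L}{\partial w_j} = (f - y)\,x_j$$

The 1/f and 1/(1 − f) factors from the log cancel against the sigmoid’s derivative, and that cancellation is exactly where the “subtraction” term comes from.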