Why is the derivative of the cost function used in gradient descent the same for linear and logistic regression, when the original cost functions are different?

In the first course of the ML Specialization (week #3), the derivative of the cost function used for gradient descent in logistic regression is exactly the same as the one used for linear regression. How did the derivatives of two totally different cost functions end up as exactly the same \(\partial J / \partial \theta_j\) used in gradient descent?

The similarity in the form of the derivatives arises because both linear and logistic regression involve a linear combination of the input features, \(X\theta\), in their hypothesis functions. The key difference is in the transformation applied to this linear combination:

  1. Linear Regression: Uses the identity function (i.e., no transformation).
  2. Logistic Regression: Uses the logistic (sigmoid) function.

Despite the different transformations, when taking the derivative with respect to \(\theta\), the chain rule results in expressions that involve the difference between the predicted values and the actual values, scaled by the input features \(X\). This leads to similar-looking gradient expressions, even though the cost functions themselves are different.
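To make this concrete, here is a sketch of the two cost functions and the gradient each one produces, writing \(h_\theta\) for the hypothesis and \(m\) for the number of training examples (using the \(\theta\) notation from above):

```latex
% Linear regression: h_\theta(x) = \theta^T x (identity transformation)
J(\theta) = \frac{1}{2m}\sum_{i=1}^{m}\bigl(h_\theta(x^{(i)}) - y^{(i)}\bigr)^2
\quad\Longrightarrow\quad
\frac{\partial J}{\partial \theta_j} = \frac{1}{m}\sum_{i=1}^{m}\bigl(h_\theta(x^{(i)}) - y^{(i)}\bigr)\,x_j^{(i)}

% Logistic regression: h_\theta(x) = \sigma(\theta^T x) (sigmoid transformation)
J(\theta) = -\frac{1}{m}\sum_{i=1}^{m}\Bigl[y^{(i)}\log h_\theta(x^{(i)}) + \bigl(1 - y^{(i)}\bigr)\log\bigl(1 - h_\theta(x^{(i)})\bigr)\Bigr]
\quad\Longrightarrow\quad
\frac{\partial J}{\partial \theta_j} = \frac{1}{m}\sum_{i=1}^{m}\bigl(h_\theta(x^{(i)}) - y^{(i)}\bigr)\,x_j^{(i)}
```

The only difference is what \(h_\theta\) means in each case: the raw linear combination for linear regression, and the sigmoid of that combination for logistic regression.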

In summary, the derivatives appear similar because they fundamentally represent the gradient of the error with respect to the model parameters, which in both cases involves the difference between predicted and actual values, weighted by the input features.
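To see the same thing in code, here is a minimal NumPy sketch (not the course's notebook code; `predict` and `gradient` are made-up names for illustration). The gradient line is identical for both models; only the prediction step changes.

```python
import numpy as np

def sigmoid(z):
    # Logistic (sigmoid) transformation.
    return 1.0 / (1.0 + np.exp(-z))

def predict(X, theta, model="linear"):
    # Linear regression applies no transformation (identity);
    # logistic regression passes the linear combination through the sigmoid.
    z = X @ theta
    return z if model == "linear" else sigmoid(z)

def gradient(X, y, theta, model="linear"):
    # For BOTH models the gradient is (1/m) * X^T (predictions - y);
    # only how 'predictions' is produced differs.
    m = X.shape[0]
    predictions = predict(X, theta, model)
    return (X.T @ (predictions - y)) / m

# Toy usage: same gradient routine, two different hypotheses.
X = np.array([[1.0, 2.0], [1.0, 3.0], [1.0, 4.0]])  # first column is the intercept
y = np.array([0.0, 1.0, 1.0])
theta = np.zeros(2)
print(gradient(X, y, theta, model="linear"))
print(gradient(X, y, theta, model="logistic"))
```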

It’s a happy coincidence of the partial derivatives for the two cost functions. Keep in mind that one uses the sum of the squares of a linear function, and the other uses a logarithmic function that includes the sigmoid() function.

The math just works out so that the partial derivatives look rather similar.
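For anyone curious where exactly this "coincidence" happens: it is the derivative of the sigmoid, \(\sigma'(z) = \sigma(z)\bigl(1 - \sigma(z)\bigr)\), cancelling the denominators produced by differentiating the logarithms. A sketch for a single training example, with \(h = \sigma(\theta^T x)\):

```latex
\frac{\partial}{\partial \theta_j}\Bigl[-y\log h - (1 - y)\log(1 - h)\Bigr]
= \Bigl(-\frac{y}{h} + \frac{1 - y}{1 - h}\Bigr)\,h(1 - h)\,x_j
= \bigl((1 - y)\,h - y\,(1 - h)\bigr)\,x_j
= (h - y)\,x_j
```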