Derivative of regularized logistic cost function-- does it need the DIMENSION of the w vector?

conscell · December 18, 2024, 3:37pm

I believe there is a mistake in your derivation. For the regularization term $\displaystyle J_{\rm reg} = {\lambda \over 2m} \sum_{j = 1}^n w_j^2$, we want to compute the derivative with respect to a specific weight $w_j$:

$$
\begin{aligned}
{\partial J_{\rm reg} \over \partial w_j} &= {\lambda \over 2m} \left( {\partial\over \partial w_j} \sum_{j = 1}^n w_j^2 \right) = {\lambda \over 2m} \left( {\partial\over \partial w_j} \sum_{k = 1}^n w_k^2 \right) = {\lambda \over 2m} \sum_{k = 1}^n {\partial\over \partial w_j} w_k^2 \\
&= {\lambda \over 2m} \left( {\partial\over \partial w_j} w_1^2 + \dots + {\partial\over \partial w_j} w_j^2 + \dots + {\partial\over \partial w_j} w_n^2 \right) \\
&= {\lambda \over 2m} \left( 0 + \dots + 2 w_j + \dots + 0 \right) = {\lambda \over m} w_j.
\end{aligned}
$$

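Please note that I changed the summation index from j to k inside the sum. The partial derivative is taken with respect to one fixed w_j, so using j as the summation index as well would make the expression ambiguous. Only the k = j term of the sum depends on w_j; every other term differentiates to zero, which is why the dimension n does not appear in the result.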
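If it helps, the result can be checked numerically. The following is a minimal sketch in plain Python, not part of the course material: the function names and the values of w, lambda and m are made up for illustration. It compares the analytic gradient (lambda / m) * w_j with a central finite-difference estimate, and it also appends extra weights to show that the existing gradient components do not change.

```python
def reg_cost(w, lam, m):
    """Regularization term J_reg = (lam / (2 m)) * sum_k w_k^2."""
    return lam / (2 * m) * sum(w_k ** 2 for w_k in w)


def reg_grad_analytic(w, lam, m):
    """Analytic gradient: dJ_reg/dw_j = (lam / m) * w_j for each j."""
    return [lam / m * w_j for w_j in w]


def reg_grad_numeric(w, lam, m, eps=1e-6):
    """Central finite-difference estimate of each partial derivative."""
    grad = []
    for j in range(len(w)):
        w_plus = list(w)
        w_minus = list(w)
        w_plus[j] += eps   # perturb only the j-th weight
        w_minus[j] -= eps
        diff = reg_cost(w_plus, lam, m) - reg_cost(w_minus, lam, m)
        grad.append(diff / (2 * eps))
    return grad


# Illustrative values only (assumptions, not from the original question).
w = [0.5, -1.2, 3.0]
lam, m = 0.7, 10

analytic = reg_grad_analytic(w, lam, m)
numeric = reg_grad_numeric(w, lam, m)
max_diff = max(abs(a - b) for a, b in zip(analytic, numeric))

# Adding more weights changes n but leaves the first components untouched.
w_longer = w + [4.0, -2.0]
analytic_longer = reg_grad_analytic(w_longer, lam, m)

print("analytic:", analytic)
print("numeric: ", numeric)
print("max abs difference:", max_diff)
print("first components unchanged:", analytic_longer[:len(w)] == analytic)
```

The two gradients should agree up to floating-point rounding, and the last line should report that the first three components are identical for the longer vector.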

Dividing by m keeps the regularization gradient on a scale similar to that of the loss gradient, which is itself averaged over the m training examples.
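To make the scale argument concrete, here is the full gradient as I understand it from the course's regularized logistic regression setup. The notation $f_{\mathbf{w},b}(\mathbf{x}^{(i)})$ for the model's prediction on example i and $x_j^{(i)}$ for that example's j-th feature is my assumption and does not appear in the derivation above:

$$
{\partial J \over \partial w_j} = {1 \over m} \sum_{i = 1}^m \left( f_{\mathbf{w},b}(\mathbf{x}^{(i)}) - y^{(i)} \right) x_j^{(i)} + {\lambda \over m} w_j
$$

Both terms carry the same 1/m factor, and neither depends on the number of weights n.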
