week-3
In gradient descent we are already doing this, so isn't the w we get after each iteration already the smaller one?
Why do we still need to add this term to the cost function? Could someone explain it in a simple way with the math, please?
![image](https://global.discourse-cdn.com/dlai/original/3X/2/c/2cfa1bbf1eab11b9301d98d9920dc9c28eb43386.png)
The regularization term in the cost function encourages the learned weights to be slightly smaller. This helps prevent overfitting.
The part circled in red in your first image is the portion of the gradients that accounts for the regularization term in the cost function.
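For reference, and if I have the week-3 notation right, this is the regularized logistic-regression cost and its gradient; the last term in each expression is what the regularization adds (b is not regularized):

$$J(\mathbf{w},b) = \frac{1}{m}\sum_{i=1}^{m}\left[-y^{(i)}\log f_{\mathbf{w},b}(\mathbf{x}^{(i)}) - (1-y^{(i)})\log\left(1 - f_{\mathbf{w},b}(\mathbf{x}^{(i)})\right)\right] + \frac{\lambda}{2m}\sum_{j=1}^{n} w_j^2$$

$$\frac{\partial J}{\partial w_j} = \frac{1}{m}\sum_{i=1}^{m}\left(f_{\mathbf{w},b}(\mathbf{x}^{(i)}) - y^{(i)}\right)x_j^{(i)} + \frac{\lambda}{m} w_j$$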
But the idea of gradient descent is to find the best-fit "w" and "b". So after we apply regularization in gradient descent, don't we already end up with the regularized "w"? Then why don't we just use that regularized "w" in the cost function without lambda?
The formula you posted is not regularized. It’s extremely likely to cause overfitting of the training set.
I think the "regularization" in gradient descent is derived from the regularization in the cost function. So to apply regularization to your algorithm (to avoid overfitting), you have to add the regularization term to both the cost function and the gradient descent update. (The gradient descent update uses the partial derivatives of the cost function anyway, so they are related!) I hope this helps!
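To make that connection concrete, here is a small numpy sketch (my own illustrative function names, assuming the logistic-regression cost with the λ/(2m) penalty from the lectures). The (λ/m)·w piece in the gradient comes directly from differentiating the regularization term in the cost:

```python
import numpy as np

def sigmoid(z):
    # Logistic function.
    return 1.0 / (1.0 + np.exp(-z))

def compute_cost_reg(X, y, w, b, lambda_):
    # Cross-entropy cost plus the (lambda / 2m) * sum(w^2) penalty.
    m = X.shape[0]
    f = sigmoid(X @ w + b)
    cross_entropy = -np.mean(y * np.log(f) + (1.0 - y) * np.log(1.0 - f))
    reg = (lambda_ / (2.0 * m)) * np.sum(w ** 2)  # regularization term
    return cross_entropy + reg

def compute_gradients_reg(X, y, w, b, lambda_):
    # Partial derivatives of the regularized cost above.
    m = X.shape[0]
    err = sigmoid(X @ w + b) - y                  # shape (m,)
    dj_dw = (X.T @ err) / m + (lambda_ / m) * w   # extra (lambda/m)*w_j term
    dj_db = np.mean(err)                          # b is not regularized
    return dj_dw, dj_db

# One gradient-descent step using both pieces:
# dj_dw, dj_db = compute_gradients_reg(X, y, w, b, lambda_=1.0)
# w, b = w - alpha * dj_dw, b - alpha * dj_db
```

Note that b is left unregularized, which matches the convention used in the course.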
It all starts with the cost function.
That should include a regularization term.
Then since the gradients are the partial derivatives of the cost function, this gives you the expression for the regularized gradients.
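Writing the resulting update out (same notation as the cost above) makes the effect visible. With the regularized gradient, each step becomes

$$w_j := w_j - \alpha\left[\frac{1}{m}\sum_{i=1}^{m}\left(f_{\mathbf{w},b}(\mathbf{x}^{(i)}) - y^{(i)}\right)x_j^{(i)} + \frac{\lambda}{m}w_j\right] = \left(1 - \alpha\frac{\lambda}{m}\right)w_j - \alpha\,\frac{1}{m}\sum_{i=1}^{m}\left(f_{\mathbf{w},b}(\mathbf{x}^{(i)}) - y^{(i)}\right)x_j^{(i)}$$

The factor (1 − αλ/m) shrinks each w_j slightly on every iteration. That shrinking only happens because the λ term is in the cost; drop it from the cost and the gradients, and the update no longer pushes the weights toward smaller values.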