Hi! After this question I realized that I hadn't understood the regularized regression model.
I was sure that it increases w. It should, by basic algebra: we multiply lambda by the sum of w**2, which is a positive number, so it increases J in the end.
So most probably I don't understand what "min" means. Can you explain this question to me, please?
The question is asking what happens to the weight values when the cost function is minimized.
The ^2 in the regularization term doesn't increase the weight values. It computes the squares of the weight values. It doesn't modify the weight values themselves - it just computes the regularized cost.
In the end, we run an optimizer that modifies the weights to get the minimum cost. Since some of the cost comes from the regularization term, the optimizer has an incentive to learn smaller weight values, in order to reduce the magnitude of that term.
Since lambda is a multiplier on the regularization term, making lambda larger creates even more incentive for the optimizer to make the weight values even smaller.
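Here is a minimal sketch of that incentive, assuming a one-feature linear regression with cost J = MSE + lambda * w**2 (the data, learning rate, and step count are all hypothetical, chosen just for illustration):

```python
import numpy as np

# Toy data: y is roughly 3*x plus a little noise (hypothetical example)
rng = np.random.default_rng(0)
X = rng.normal(size=100)
y = 3.0 * X + rng.normal(scale=0.1, size=100)

def fit(lam, lr=0.1, steps=1000):
    """Minimize J(w, b) = (1/2m) * sum(err^2) + lam * w^2 by gradient descent."""
    w, b = 0.0, 0.0
    m = len(y)
    for _ in range(steps):
        err = (w * X + b) - y
        grad_w = (err @ X) / m + 2 * lam * w  # regularization adds 2*lam*w to the gradient
        grad_b = err.mean()
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

w0, _ = fit(lam=0.0)     # no regularization: w lands near the true slope, 3
w_reg, _ = fit(lam=1.0)  # with regularization: the optimizer settles on a smaller w
print(w0, w_reg)
```

The data-fitting part of the cost pulls w toward 3, while the lambda * w**2 part pulls it toward 0; the learned w is a compromise between the two, and a larger lambda shifts that compromise toward smaller weights.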
How does it do that? Can you clarify, please? Or maybe I don't quite understand what you mean?
The optimizer uses the gradients of the cost function to find the minimum. Mathematically, it’s like walking down a slope to find the bottom of the valley.
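The "walking down a slope" idea can be sketched in one dimension, assuming a simple made-up cost f(x) = (x - 2)**2 whose minimum we know is at x = 2:

```python
# Minimal sketch of gradient descent: minimize f(x) = (x - 2)^2.
# The derivative f'(x) = 2*(x - 2) is the slope; stepping against it
# moves us downhill toward the bottom of the "valley".
def gradient_descent(x0=10.0, lr=0.1, steps=100):
    x = x0
    for _ in range(steps):
        grad = 2 * (x - 2)   # slope at the current position
        x -= lr * grad       # step in the downhill direction
    return x

x_min = gradient_descent()
print(x_min)  # very close to 2, the bottom of the valley
```

In the regression case the same loop runs over w and b instead of a single x, and the slope comes from the gradient of the cost function with respect to each parameter.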
I don't quite understand: what is the difference between gradient descent and minimization of the w and b parameters in this case?
Does gradient descent work only with the y axis, and not return w and b?
Gradient descent is a method to adjust the w and b values so that they give the minimum cost.
We’re not interested in minimizing w and b, we’re interested in the minimum cost.
Oh, I see. So do you mean that the cost can be lower with a bigger w or b, correct?
Yes, w and b can have any value, small or large, positive or negative or zero, depending on the problem. If large values of w and b give a small cost, we accept that.
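To make that concrete, here is a minimal sketch, assuming hypothetical data whose true relationship deliberately has a large slope and intercept (w = 50, b = 7). Without regularization, nothing pushes the parameters toward zero, so gradient descent happily learns large values because that is what minimizes the cost:

```python
import numpy as np

# Hypothetical noise-free data with a large true slope and intercept
rng = np.random.default_rng(1)
x = rng.normal(size=200)
y = 50.0 * x + 7.0

# Plain (unregularized) gradient descent on J = (1/2m) * sum(err^2)
w, b, lr = 0.0, 0.0, 0.1
for _ in range(2000):
    err = (w * x + b) - y
    w -= lr * (err @ x) / len(x)
    b -= lr * err.mean()

print(w, b)  # near 50 and 7: large parameter values, and a tiny cost
```

The optimizer never "cares" how big w and b are; it only follows the gradient of the cost, and here the minimum cost happens to sit at large parameter values.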