Hi! After this question I realized that I hadn't understood the regularized regression model.
I was sure that it increases w. It should, by basic algebra: we multiply lambda by the sum of w**2, which is a positive number, so it increases J in the end.
So most probably I don't understand what "min" means. Can you explain this question to me, please?
The question is asking what happens to the weight values when the cost function is minimized.
The ^2 in the regularization term doesn't increase the weight values. It computes the squares of the weight values. It doesn't modify the weight values themselves - it just computes the regularized cost.
In the end, we run an optimizer that modifies the weights to get the minimum cost. Since some of the cost comes from the regularization term, the optimizer has an incentive to learn smaller weight values, in order to reduce the magnitude of that term.
Since lambda is a multiplier on the regularization term, making lambda larger creates even more incentive for the optimizer to make the weight values even smaller.
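Here is a minimal sketch of that incentive, assuming a one-feature linear regression with cost J = MSE + lambda * w**2 (the data, learning rate, and step count are all hypothetical, chosen just for illustration):

```python
import numpy as np

# Toy data: y is roughly 3*x plus a little noise (hypothetical example)
rng = np.random.default_rng(0)
X = rng.normal(size=100)
y = 3.0 * X + rng.normal(scale=0.1, size=100)

def fit(lam, lr=0.1, steps=1000):
    """Minimize J(w, b) = (1/2m) * sum(err^2) + lam * w^2 by gradient descent."""
    w, b = 0.0, 0.0
    m = len(y)
    for _ in range(steps):
        err = (w * X + b) - y
        grad_w = (err @ X) / m + 2 * lam * w  # regularization adds 2*lam*w to the gradient
        grad_b = err.mean()
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

w0, _ = fit(lam=0.0)     # no regularization: w lands near the true slope, 3
w_reg, _ = fit(lam=1.0)  # with regularization: the optimizer settles on a smaller w
print(w0, w_reg)
```

The data-fitting part of the cost pulls w toward 3, while the lambda * w**2 part pulls it toward 0; the learned w is a compromise between the two, and a larger lambda shifts that compromise toward smaller weights.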
How does it do that? Can you clarify, please? Or maybe I don't quite understand what you mean?
The optimizer uses the gradients of the cost function to find the minimum. Mathematically, it’s like walking down a slope to find the bottom of the valley.
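The "walking down a slope" idea can be sketched in one dimension, assuming a simple made-up cost f(x) = (x - 2)**2 whose minimum we know is at x = 2:

```python
# Minimal sketch of gradient descent: minimize f(x) = (x - 2)^2.
# The derivative f'(x) = 2*(x - 2) is the slope; stepping against it
# moves us downhill toward the bottom of the "valley".
def gradient_descent(x0=10.0, lr=0.1, steps=100):
    x = x0
    for _ in range(steps):
        grad = 2 * (x - 2)   # slope at the current position
        x -= lr * grad       # step in the downhill direction
    return x

x_min = gradient_descent()
print(x_min)  # very close to 2, the bottom of the valley
```

In the regression case the same loop runs over w and b instead of a single x, and the slope comes from the gradient of the cost function with respect to each parameter.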
I don't quite understand: what is the difference between gradient descent and minimization of the w and b parameters in this case?
Does gradient descent work only with the y axis, and not return w and b?
Gradient descent is a method to adjust the w and b values so that they give the minimum cost.
We’re not interested in minimizing w and b, we’re interested in the minimum cost.
Oh, I see. So do you mean that the cost can be lower with a bigger w or b, correct?
Yes, w and b can have any value, small or large, positive or negative or zero, depending on the problem. If large values of w and b give a small cost, we accept that.
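To make that concrete, here is a minimal sketch, assuming hypothetical data whose true relationship deliberately has a large slope and intercept (w = 50, b = 7). Without regularization, nothing pushes the parameters toward zero, so gradient descent happily learns large values because that is what minimizes the cost:

```python
import numpy as np

# Hypothetical noise-free data with a large true slope and intercept
rng = np.random.default_rng(1)
x = rng.normal(size=200)
y = 50.0 * x + 7.0

# Plain (unregularized) gradient descent on J = (1/2m) * sum(err^2)
w, b, lr = 0.0, 0.0, 0.1
for _ in range(2000):
    err = (w * x + b) - y
    w -= lr * (err @ x) / len(x)
    b -= lr * err.mean()

print(w, b)  # near 50 and 7: large parameter values, and a tiny cost
```

The optimizer never "cares" how big w and b are; it only follows the gradient of the cost, and here the minimum cost happens to sit at large parameter values.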