I was sure that it increases w. By the principles of algebra, it should: lambda is multiplied by the sum of w**2, which is a positive number, so it will increase J in the end.

So most probably I didn’t understand what “min” means. Could you please explain this question to me?

The question is asking what happens to the weight values when the cost function is minimized.

The ^2 in the regularization term doesn’t increase the weight values. It computes the squares of the weight values. It doesn’t modify the weight values themselves; it is only used to compute the regularized cost.
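Here is a small sketch of that point. It assumes the usual squared-error cost with the penalty scaled by lambda / (2m); the function name `regularized_cost` and the toy data are made up for illustration. Computing the cost reads w but never changes it; only the cost value goes up when lambda is larger.

```python
import numpy as np

def regularized_cost(w, b, X, y, lam):
    """Squared-error cost plus an L2 penalty on w (assumed form; b is not penalized)."""
    m = X.shape[0]
    errors = X @ w + b - y
    fit_term = np.sum(errors ** 2) / (2 * m)
    reg_term = lam / (2 * m) * np.sum(w ** 2)  # squares are computed, w is not modified
    return fit_term + reg_term

# Toy data, purely illustrative
X = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
y = np.array([1.0, 2.0, 3.0])
w = np.array([0.5, -2.0])
b = 0.1

w_before = w.copy()
cost_no_reg = regularized_cost(w, b, X, y, lam=0.0)
cost_with_reg = regularized_cost(w, b, X, y, lam=1.0)

print("cost without penalty:", cost_no_reg)
print("cost with penalty:   ", cost_with_reg)
print("w unchanged:", np.array_equal(w, w_before))
```

So a positive lambda does increase J for a fixed w, which is exactly why the optimizer then looks for a different, smaller w.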

In the end, we run an optimizer that modifies the weights to get the minimum cost. Since some of the cost comes from the regularization term, the optimizer has an incentive to learn smaller weight values, in order to reduce the magnitude of the regularization term.

Since lambda is a multiplier on the regularization term, making lambda larger gives the optimizer even more incentive to make the weight values smaller.
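To see this effect numerically, here is a rough sketch for linear regression without a bias term, again assuming the penalty is scaled by lambda / (2m). For that cost the minimizing w can be written in closed form by solving (XᵀX + lambda·I) w = Xᵀy. The helper name `ridge_minimizer` and the synthetic data are assumptions for the example.

```python
import numpy as np

# Synthetic data, purely illustrative
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
true_w = np.array([3.0, -2.0, 1.0])
y = X @ true_w + 0.1 * rng.normal(size=50)

def ridge_minimizer(X, y, lam):
    """Return the w that minimizes the L2-regularized squared-error cost (no bias term)."""
    n = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n), X.T @ y)

lambdas = [0.0, 1.0, 10.0, 100.0, 1000.0]
norms = [np.linalg.norm(ridge_minimizer(X, y, lam)) for lam in lambdas]

for lam, norm in zip(lambdas, norms):
    print(f"lambda={lam:7.1f}  ||w||={norm:.4f}")
```

The printed norm of the minimizing w shrinks as lambda grows: the cost at any fixed w goes up, but the w that minimizes the cost gets smaller.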

The optimizer uses the gradients of the cost function to find the minimum. Mathematically, it’s like walking down a slope to find the bottom of the valley.
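As a minimal sketch of "walking down the slope", here is plain gradient descent on the same kind of regularized cost (no bias term, penalty scaled by lambda / (2m)). The function names, learning rate `alpha`, step count, and data are all assumptions chosen for the example, not values from the course.

```python
import numpy as np

def gradient(w, X, y, lam):
    """Gradient of the regularized squared-error cost with respect to w."""
    m = X.shape[0]
    return (X.T @ (X @ w - y) + lam * w) / m

def gradient_descent(X, y, lam, alpha=0.1, steps=2000):
    """Start at w = 0 and repeatedly step against the gradient."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        w = w - alpha * gradient(w, X, y, lam)
    return w

# Noise-free synthetic data, purely illustrative
rng = np.random.default_rng(1)
X = rng.normal(size=(40, 2))
y = X @ np.array([2.0, -1.0])

w_small_lam = gradient_descent(X, y, lam=0.0)
w_large_lam = gradient_descent(X, y, lam=50.0)

print("lambda=0:  w =", w_small_lam)
print("lambda=50: w =", w_large_lam)
```

With lambda = 0 the optimizer recovers the weights that generated the data; with a large lambda the extra `lam * w` part of the gradient keeps pulling w toward zero, so it settles at smaller values.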

Yes, w and b can have any value, small or large, positive or negative or zero, depending on the problem. If large values of w and b give a small cost, we accept that.