I’ve just finished Course 1 of the specialization and tried to implement a polynomial regressor with feature scaling and regularization.
I tried to fit the model on these input values:
features = np.array([0.5, 3, 5, 12, 100])
targets = np.array([-0.25, 6, 20, 132, 9900])
These values lie exactly on the curve x^2 - x.
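For clarity, here is a minimal sketch of the data setup I’m describing. It assumes the polynomial design matrix is [x, x^2] (which matches the order of the weights in the results below) and just verifies that the targets lie exactly on x^2 - x:

```python
import numpy as np

features = np.array([0.5, 3, 5, 12, 100])
targets = np.array([-0.25, 6, 20, 132, 9900])

# Assumed feature layout: column 0 is x, column 1 is x^2,
# matching the weight order w = [w_x, w_x2] in the results below.
X_poly = np.c_[features, features**2]

# Sanity check: every target equals x^2 - x exactly.
print(np.allclose(X_poly[:, 1] - X_poly[:, 0], targets))  # True
```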
Without regularization (regularization parameter = 0), I get this result:
Cost: 1.0057510780545684e-19
w: [-1. 1.]
b: -6.161826604511589e-10
This is good so far. But when I set the regularization parameter to 1, I get this result:
Cost: 79600.11697910336
w: [46.0003336 0.45965975]
b: -32.76440392902737
I’m okay with the cost going up; after all, we’re reducing overfitting, which should result in an increase in the cost function.
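To be concrete about which cost I’m reporting: it’s the regularized squared-error cost from the course, roughly like the sketch below (lambda_ is the regularization parameter; the function name is just illustrative):

```python
import numpy as np

def compute_cost_regularized(X, y, w, b, lambda_):
    """Squared-error cost plus an L2 penalty on the weights (b is not regularized)."""
    m = X.shape[0]
    predictions = X @ w + b
    mse_term = np.sum((predictions - y) ** 2) / (2 * m)
    reg_term = (lambda_ / (2 * m)) * np.sum(w ** 2)
    return mse_term + reg_term
```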
But the weights? I would expect all of the weights to get smaller. Even though the weight for the x^2 term got smaller, the weight for the x term got much bigger.
Is this correct behavior? I suspect that since I’m doing feature scaling, the feature values become much smaller, which might have an impact.
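By feature scaling I mean z-score normalization of each feature column, roughly like this sketch (the exact code was removed, so treat the names as illustrative):

```python
import numpy as np

def zscore_normalize(X):
    """Scale each feature column to zero mean and unit standard deviation."""
    mu = np.mean(X, axis=0)
    sigma = np.std(X, axis=0)
    return (X - mu) / sigma, mu, sigma

# The x^2 column spans a much larger range than the x column,
# so after scaling both columns end up on a comparable scale.
features = np.array([0.5, 3, 5, 12, 100])
X_scaled, mu, sigma = zscore_normalize(np.c_[features, features**2])
print(X_scaled.std(axis=0))  # approximately [1. 1.]
```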
If I set the regularization parameter to something much smaller, like 0.001, I get this result:
Cost: 37.974732753837024
w: [1.59413222 0.97505558]
b: -11.740479905065058
I have a feeling that this is not correct, and that it’s a bug.
My code for gradient descent:
{Moderator’s Edit: Code Removed}
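Since the code itself was removed, here is a minimal sketch of the kind of update loop I’m describing, following the regularized gradient descent rules from the course (illustrative, not my exact implementation):

```python
import numpy as np

def gradient_descent_regularized(X, y, w, b, alpha, lambda_, num_iters):
    """Batch gradient descent for linear regression with an L2 penalty on w."""
    m = X.shape[0]
    for _ in range(num_iters):
        err = X @ w + b - y                        # prediction error, shape (m,)
        dw = (X.T @ err) / m + (lambda_ / m) * w   # weight gradient, with regularization
        db = np.sum(err) / m                       # bias gradient, not regularized
        w = w - alpha * dw
        b = b - alpha * db
    return w, b
```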
If this is not a bug, how is it correct behavior? Why does one weight get much bigger while the other gets smaller? Why not both together?
And is there a way to make sense of the weights and determine whether the current value of the regularization parameter is good or not? Is there a threshold between underfitting and overfitting that I should keep in mind, or is it just a matter of experimentation?