# dj_dw, dj_db: gradients of the weights and the scalar bias, as covered in the Week 2 lecture on gradient descent

I used gradient descent to implement multiple linear regression. The problem is that the gradient dj_dw converges to zero, while the gradient dj_db does not converge to zero and instead stays constant.

In the output above, dj_dw is close to 0 while dj_db stays constant at about -290.

Check if your model is overfitting.

Actually, I would like Raymond to reply to this; he would be the right person to explain it.
@rmwkwok, can you please have a look?

Also, Vivek, if this is not a course-related assignment, then kindly give details about your model and what you are looking for.

Regards
DP

If the linear regression model is y = wx + b, then dj_dw will be changing as you fit the data; these are the tunable weights. The dj_db, where b is a constant, will be an offset!

There's no rule that says that the weights and bias must converge at the same rate.
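To illustrate the point about different convergence rates, here is a minimal sketch in Python with NumPy. It is not the learner's model: the data, the learning rate `alpha`, and the helper `compute_gradient` are all made up for illustration. With one unscaled feature in the thousands, the cost is far steeper along w than along b, so dj_dw drops to nearly zero within a few iterations while dj_db stays almost constant.

```python
import numpy as np

def compute_gradient(X, y, w, b):
    """Gradients of the squared-error cost J = 1/(2m) * sum((X @ w + b - y)**2)."""
    err = X @ w + b - y          # prediction error per sample, shape (m,)
    dj_dw = X.T @ err / len(y)   # one gradient entry per weight
    dj_db = err.mean()           # gradient for the scalar bias
    return dj_dw, dj_db

# Hypothetical data: one unscaled feature, true w = 2 and true b = 500.
X = np.array([[1000.0], [2000.0], [3000.0], [4000.0]])
y = 2.0 * X[:, 0] + 500.0

w = np.zeros(1)
b = 0.0
alpha = 1e-7                     # assumed; kept small because the feature is large

history = []                     # (dj_dw, dj_db) before each update
for _ in range(1000):
    dj_dw, dj_db = compute_gradient(X, y, w, b)
    history.append((dj_dw[0], dj_db))
    w = w - alpha * dj_dw
    b = b - alpha * dj_db

print("iteration 500 :", history[499])
print("iteration 1000:", history[-1])
```

In this toy setup dj_db settles near -83 and barely changes over 1000 iterations, because each step moves b by only alpha * dj_db. Scaling the feature (for example, z-score normalization) would allow a much larger alpha, and both gradients would then shrink together.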

When we speak about "large" and "small", we need to know what we are comparing with.

The cost values (assuming squared cost) tell us that the average error per sample is pretty large (estimate: \sqrt{50000000} \approx 7000).

Assuming that the model has reached an error of about 5% of the label values, the labels are on the order of 7000/0.05 = 140,000.

We know that the model is y = \vec{w} \cdot \vec{x} + b, and that the change of b per iteration is \approx 290\alpha, so whether such a change is large or not depends on how it compares with the labels: \frac{290\alpha}{140000} \approx \frac{\alpha}{480}.

The question is, is \frac{\alpha}{480} large?
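The back-of-the-envelope estimate above can be reproduced with a few lines of Python. This is only a sketch: the cost of 50,000,000 and the gradient of 290 are the rough figures from this thread, the 5% error is an assumption, and the sample `alpha` values are hypothetical. If the cost carries a 1/(2m) factor, the per-sample error would be \sqrt{2 \cdot 50000000} = 10000 instead, which does not change the order of magnitude.

```python
import math

cost = 50_000_000                 # rough squared-error cost from the thread
rms_error = math.sqrt(cost)       # typical error per sample, about 7071

assumed_relative_error = 0.05     # assumption: error is about 5% of the labels
label_scale = rms_error / assumed_relative_error   # about 141,000

dj_db = 290                       # magnitude of the constant bias gradient
ratio_per_alpha = dj_db / label_scale              # change of b relative to labels, per unit alpha

print(f"rms error   ~ {rms_error:.0f}")
print(f"label scale ~ {label_scale:.0f}")
print(f"relative step of b ~ alpha / {1 / ratio_per_alpha:.0f}")

# Hypothetical learning rates, to see how small each update of b is
for alpha in (1e-3, 1e-1):
    print(f"alpha={alpha}: b moves by {alpha * dj_db:.3g} per iteration")
```

Without rounding 7071 down to 7000, the ratio comes out slightly below \frac{\alpha}{480}, which is consistent with the estimate.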

Lastly, note that the above are my estimations, but the learner is responsible for providing the actual numbers.

Cheers,
Raymond

Hello Raymond,

What do you mean by this? Even I couldn't reply to the learner's query, as only his training image was posted, without much information about the dataset, model and batch size.

From the image, all I understood is that gradient descent learnt enough about the model and then lost its way due to some error in the analysis, and I can't interpret it further as I do not have the whole information.

Regards
DP

I mean I may not be able to respond to the learner's future feedback promptly, so if they do reply, please help take care of it.

Cheers,
Raymond