Gradients dj_dw, dj_db not converging simultaneously

dj_dw, dj_db: the gradients of the weights and of the bias (a scalar), as highlighted in the Week 2 lecture on gradient descent.

I used gradient descent to implement multiple linear regression. The problem is that the gradient dj_dw converges to zero, but at the same time the gradient dj_db does not converge to zero and instead becomes constant.


For the training output above, the value of dj_dw is close to 0 while dj_db stays roughly constant at about -290.


Check if your model is overfitting.

Actually, I would like Raymond to reply to this; he would be the right person to explain it.
@rmwkwok, can you please have a look?

Also, Vivek, if this is not a course-related assignment, then kindly give details about your model and what you are looking for.

Regards
DP

If the linear regression model is y = wx + b, then dj_dw will keep changing as you fit the data; these w values are the tunable weights. dj_db corresponds to b, which is a constant offset!

There’s no rule that says that the weights and bias must converge at the same rate.
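To make that concrete, below is a minimal NumPy sketch (the toy data, the learning rate, and all variable names are purely illustrative, not the learner's actual setup) that runs plain gradient descent for y = \vec{w} \cdot \vec{x} + b and prints both gradients along the way:

```python
import numpy as np

# Toy multiple linear regression data (illustrative only, not the learner's dataset):
# y = 3*x1 + 5*x2 + 100 + noise, with features in [0, 10] but an intercept of 100.
rng = np.random.default_rng(0)
m = 100
X = rng.uniform(0, 10, size=(m, 2))
y = X @ np.array([3.0, 5.0]) + 100.0 + rng.normal(0.0, 1.0, m)

w = np.zeros(2)   # weight vector
b = 0.0           # bias (scalar)
alpha = 0.01      # learning rate, chosen just for this demo

for i in range(1001):
    err = X @ w + b - y        # per-sample prediction error
    dj_dw = X.T @ err / m      # gradient w.r.t. the weights
    dj_db = err.mean()         # gradient w.r.t. the bias
    w -= alpha * dj_dw
    b -= alpha * dj_db
    if i % 200 == 0:
        print(f"iter {i:4d}  |dj_dw| = {np.linalg.norm(dj_dw):9.3f}  dj_db = {dj_db:9.3f}")
```

In runs like this, |dj_dw| typically drops well below |dj_db| while b is still far from the true intercept, simply because the features and the constant term sit on very different scales; feature scaling or more iterations usually narrows that gap.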

Hello @Deepti_Prasad,

When we speak about “large” and “small”, we need to know what we are comparing with.

The cost values (assuming squared cost) are telling us that the averaged error per sample is pretty large (estimate: \sqrt{50000000} \approx 7000).

Assuming that the model's error is around 5% of the label values, the labels are on the order of 7000/0.05 = 140,000.

We know that the model is y = \vec{w} \cdot \vec{x} + b , and that each update changes b by roughly \alpha \times 290 = 290\alpha, so whether such a change is large or not should be judged against the labels: \frac{290\alpha}{140000} \approx \frac{\alpha}{480}.

The question is, is \frac{\alpha}{480} large?
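Just to make the arithmetic concrete, here is the same back-of-envelope estimate in a few lines of Python (the cost of 50,000,000, the 5% figure, and the learning rate below are assumed numbers for illustration, not the learner's actual values):

```python
import math

cost = 50_000_000                            # assumed squared-error cost read off the plot
err_per_sample = math.sqrt(cost)             # ~7000: average error per sample
label_scale = err_per_sample / 0.05          # if the error is ~5% of the labels -> ~140,000

dj_db = 290                                  # magnitude of the "stuck" bias gradient
alpha = 0.001                                # hypothetical learning rate, illustration only

change_in_b = alpha * dj_db                  # per-update change of b
relative_change = change_in_b / label_scale  # compare against the label scale

print(f"error per sample     ~ {err_per_sample:,.0f}")
print(f"label scale          ~ {label_scale:,.0f}")
print(f"per-step change of b ~ {change_in_b:.3f} ({relative_change:.2e} of the label scale)")
```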

Lastly, note that the above are only my estimates; the learner is responsible for providing the actual numbers.

Cheers,
Raymond

(I may not be able to follow up on this thread; please help me, @Deepti_Prasad)

Hello Raymond,

What do you mean by this? Even I couldn’t reply to the learner’s query, as only his training image was posted, without much information about the dataset, model, or batch size.

From the image, all I understood was that gradient descent learned enough about the model and then lost its way due to some error in the analysis; I cannot interpret further, as I do not have the full information.

Regards
DP

I mean that I may not be able to respond promptly to the learner’s future feedback, so if they do follow up, please help take care of it.

Cheers,
Raymond