Hi there. While doing this lab, I set about validating, in Excel, the cost for the function f_w_b(x) = w*x^2 + b at w = 1 and b = 0.049. The cost I calculated (0.4522) is quite far from what I’m seeing in this code from the lab (0.208962 at iteration 9000):
X = X.reshape(-1, 1)  # X should be a 2-D matrix
model_w,model_b = run_gradient_descent_feng(X, y, iterations=10000, alpha = 1e-5)
The fourth column is the prediction error: the third column minus the second column. The fifth column is the squared error, the square of the fourth column. Thanks for taking a look!
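For concreteness, here is a minimal Python sketch of the same computation my sheet does. The dataset (x = 0..19, y = 1 + x^2) and the 1/(2m) squared-error cost are my reading of the lab; under those assumptions it reproduces my 0.4522:

```python
import numpy as np

# Assumed lab dataset: 20 points with y = 1 + x^2 (adjust if yours differs)
x = np.arange(0, 20, 1)
y = 1 + x**2

# Parameters I am checking, taken from the lab's printout
w, b = 1.0, 0.049

f = w * x**2 + b                      # prediction (3rd column of my sheet)
err = f - y                           # prediction error (4th column)
cost = np.sum(err**2) / (2 * len(x))  # squared-error cost with the 1/(2m) factor
print(f"cost at w={w}, b={b}: {cost:.4f}")  # 0.4522 under these assumptions
```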
The result of the gradient descent with 10,000 iterations is b = 0.049. I agree that b = 0.49 results in a much lower cost, but I’m just trying to validate the results for those statements (i.e., that the cost at b = 0.049 is less than the cost at iteration 9000, 0.208962). Since my cost is higher, I’m assuming that either something is wrong with the code or something is wrong with my method (probably the latter, but I want to know why it is not correct).
Are you trying to fit a straight line to a parabola (f = x^2 + 1)?
This would be a little simpler to verify if you just list the two columns: the ‘x’ value and the corresponding ‘y’ value. The table you gave doesn’t define ‘y’.
If you have ‘x’, then you already know ‘x^2’. You’re then just doing a linear regression on a linear function (of x^2 against y), and the results are b = 1 and w = 1.
This is because I think you defined ‘y = x^2 + 1’ (see the quick check below).
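You can verify that claim with an exact least-squares fit on the engineered feature. A quick sketch; the dataset is my assumption, and np.polyfit stands in for the lab’s gradient descent:

```python
import numpy as np

x = np.arange(0, 20, 1)   # assumed dataset
y = x**2 + 1

# Fit y = w * (x^2) + b by least squares; polyfit returns [w, b] for degree 1
w, b = np.polyfit(x**2, y, deg=1)
print(w, b)               # expect w = 1.0 and b = 1.0 (up to floating point)
```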
I think your learning rate is way too small to get a good solution using gradient descent.
It would be useful to plot the cost history during gradient descent; this will let you see whether the cost has stabilized at a minimum. A sketch follows.
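If run_gradient_descent_feng doesn’t hand back a cost history, a simplified loop like this one records and plots it (a sketch under my assumptions about the lab’s cost and update rule, not the lab’s actual code):

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.arange(0, 20, 1)        # assumed dataset
X = (x**2).astype(float)       # engineered feature, as in the lab
y = x**2 + 1.0
m = len(y)

w, b, alpha = 0.0, 0.0, 1e-5
cost_history = []

for _ in range(10000):
    err = w * X + b - y                        # prediction error
    cost_history.append(np.sum(err**2) / (2 * m))
    w -= alpha * np.sum(err * X) / m           # batch gradient-descent step
    b -= alpha * np.sum(err) / m

plt.plot(cost_history)
plt.xlabel("iteration")
plt.ylabel("cost")
plt.show()                     # a flat tail means the cost has stabilized
```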
I’m sorry for the confusion. y is the actual value (the second column).
I’m not actually doing any regression or gradient descent in the Excel sheet, just checking the results at a particular point in the lab: the b = 0.049 that comes out after 10,000 iterations of gradient descent fitting w*x^2 + b to x^2 + 1. So I’m just computing the cost of f = x^2 + 0.049 as a fit for y = x^2 + 1.
Yes, I get the same, but the code outputs a different, smaller cost at iteration 9000 (0.208962, as mentioned in my OP). At iteration 10,000, which yields b = 0.049, I believe the cost should be even lower than 0.208962, not higher (my 0.4522). That is my issue here.
Please try these three steps (a sketch of the first two follows this list):
1. Print model_w, model_w[0], and model_b. You will see something different.
2. Re-compute the cost in the Jupyter notebook using the model_w and model_b that were returned.
3. Update your Excel sheet with those latest w and b values, and compare its result with step 2.
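A minimal sketch of steps 1 and 2, to run in the lab notebook after the gradient-descent cell (it relies on the notebook’s X, y, model_w, and model_b; compute_cost below is my own helper with the course’s 1/(2m) convention, not necessarily the lab’s function):

```python
import numpy as np

def compute_cost(X, y, w, b):
    # squared-error cost with the 1/(2m) factor (my helper, not the lab's)
    m = X.shape[0]
    err = X @ w + b - y
    return np.sum(err**2) / (2 * m)

# Step 1: look at the returned parameters at full precision
print(repr(model_w), repr(model_w[0]), repr(model_b))

# Step 2: re-compute the cost with the unrounded parameters
print(compute_cost(X, y, model_w, model_b))
```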
Note also that although it prints the cost value at the 9000th iteration, it does not print the w and b at the 9000th iteration. It only prints the w and b at the last iteration (the 9999th, since the 10,000 iterations count from 0). Therefore it is wrong to compare the cost at the 9000th iteration with the w and b from the 9999th. If you want to examine how the printing works, go to lab_utils_multi.py; you might also modify the print to include the w and b at the 9000th iteration, as in the sketch below.
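In case it helps, the pattern is roughly like this simplified stand-in (the real implementation is in lab_utils_multi.py and differs in its details); note how the intermediate print can be extended to show w and b as well:

```python
import numpy as np

def run_gd_sketch(X, y, alpha=1e-5, num_iters=10000):
    # Simplified stand-in for the lab's runner, not its actual code.
    m = len(y)
    w, b = 0.0, 0.0
    for i in range(num_iters):                 # i runs 0 ... num_iters-1 (e.g. 9999)
        err = w * X + b - y
        w -= alpha * np.sum(err * X) / m       # batch gradient-descent update
        b -= alpha * np.sum(err) / m
        if i % 1000 == 0:
            cost = np.sum((w * X + b - y)**2) / (2 * m)
            # the original only prints the cost; adding w and b here shows
            # the parameters at the 9000th iteration too
            print(f"Iteration {i:5d}: cost {cost:.6f}, w {w:.6f}, b {b:.6f}")
    return w, b                                # only the final parameters come back

# usage: w, b = run_gd_sketch((np.arange(20)**2).astype(float), np.arange(20)**2 + 1.0)
```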
Thank you, Raymond. I haven’t had a chance to look at it yet, but regarding your comment at the end that it is wrong to compare the cost at the 9000th iteration with the w and b from the 9999th: shouldn’t the cost be monotonically decreasing (given a suitable learning rate), and therefore shouldn’t I expect cost_9999 < cost_9000?
That’s a reasonable argument, but we need to do everything correctly, including all three of the steps I suggested; otherwise we will be discussing on the basis of wrong observations. Let’s put the discussion on hold, make sure everything is right, and then discuss.
As a matter of fact, thinking about it the other way around, your reasonable argument suggested that something was wrong in the observations, didn’t it? Great argument!
TMosh - my apologies, I just noticed that I wrote “0.49” instead of “0.049” in my initial post. I have since corrected it, though I am still perplexed about the discrepancy between the lab and my result.
Although these parameters are very close to what I had in my Excel sheet, I updated them, and lo and behold, the resulting cost was lower than cost_9000, consistent with convergence! It’s a shame that the code prints figures so rounded that they produce significant cost differences, but I suppose this will ultimately make me a better Python coder.
In fact, I was also confused at first, so I had to do some investigating too. I only realized the rounding issue after printing out w * x^2 and finding that the result wasn’t consistent with w = 1.
I really do think that your earlier argument was great. We need leads to investigate; mine was merely that the Excel result should be the same as the Python result, whereas yours showed a deeper understanding.
With such good sense, I am sure that, as you said, you will become a better coder, and being a good coder is also essential for doing data science on computers.
Cheers,
Raymond
PS: In case you are interested, with my lead, what I intended to do was repeat your Excel sheet’s steps one by one in Python; that’s why I computed w * x^2. I wanted to find the step at which the two results started to differ, roughly as in the sketch below.
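Concretely, something like this (the dataset is my assumption; model_w and model_b are the values returned in the notebook):

```python
import numpy as np

x = np.arange(0, 20, 1)            # assumed dataset
y = 1 + x**2

def excel_columns(w, b):
    # reproduce the Excel sheet column by column
    f = w * x**2 + b               # prediction
    err = f - y                    # prediction error
    cost = np.sum(err**2) / (2 * len(x))
    return f, err, cost

f_rounded, err_rounded, cost_rounded = excel_columns(1.0, 0.049)  # printed values
print(cost_rounded)                # 0.4522 under these assumptions

# in the notebook, compare against the unrounded parameters:
# f_full, err_full, cost_full = excel_columns(model_w[0], model_b)
# the first column whose values differ is where the rounding shows up
```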