Can’t manually calculate np.dot() for first feature vector in C1_W2_Lab02_Multiple_Variable_Soln

Frazer_Mann · December 28, 2023, 7:18am

Hi,

Although I have finished the course, I’m in the process of going back through a couple of things and manually calculating them. I’m currently on the Multiple Variable Linear Regression lab, specifically the following function:

import numpy as np

def compute_cost(X, y, w, b):
    """
    compute cost
    Args:
      X (ndarray (m,n)): Data, m examples with n features
      y (ndarray (m,)) : target values
      w (ndarray (n,)) : model parameters
      b (scalar)       : model parameter
    Returns:
      cost (scalar): cost
    """

    m = X.shape[0]
    cost = 0.0
    for i in range(m):
        f_wb_i = np.dot(X[i], w) + b           # (n,)(n,) = scalar (see np.dot)
        cost = cost + (f_wb_i - y[i])**2       # accumulate squared error, scalar
    cost = cost / (2 * m)                      # divide once, after the loop
    return cost

If I set m = 1 and run the modified code:

    m=1
    i=0
    cost = 0.0                             
    f_wb_i = np.dot(X[i], w) + b           #(n,)(n,) = scalar (see np.dot)
    return f_wb_i

Then I get 459.99999761940825, but in Excel I get 457.2311368.

X = [2104, 5, 1, 45]   (Note: I’m only showing the first feature vector rather than all 3.)
y = [460, 232, 178]
w = [0.39, 18.75, -53.36, -26.42]
b = 785.1811367994083

Calculating WX
My understanding of the dot product was X0*W0 + X1*W1 + … + Xn*Wn

So I get:
820.56 + 93.75 - 53.36 - 1188.9 = -327.95

Adding b to get the model output for i = 0:
= -327.95 + 785.18
= 457.23

I’m confused as to why np.dot() is getting a different value and where I have screwed up.
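The hand calculation above can be reproduced in a few lines. This is a minimal sketch that uses only the rounded values quoted in this post (the notebook's full-precision weights are not reproduced here); the names `x0` and `w_rounded` are illustrative and do not come from the lab.

```python
import numpy as np

# First training example and the *rounded* weights quoted in the post.
x0 = np.array([2104, 5, 1, 45])
w_rounded = np.array([0.39, 18.75, -53.36, -26.42])
b = 785.1811367994083

# Element-wise products, then their sum: for 1-D inputs this is what np.dot computes.
products = x0 * w_rounded
manual = products.sum() + b
via_dot = np.dot(x0, w_rounded) + b

print(products)   # individual terms x_j * w_j
print(manual)     # close to the spreadsheet figure of about 457.23
print(via_dot)    # same value, up to floating-point rounding
```

Both routes agree with the spreadsheet figure, which suggests the arithmetic is fine and that the gap to 459.99999761940825 comes from the inputs rather than from np.dot() itself.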

rmwkwok · December 28, 2023, 1:53pm

That’s right, @Frazer_Mann! (I added the “…”)

Btw, it’s better to verify with integer values (other than zeros and ones). Even when you print a floating-point value, the output is not guaranteed to show all of its decimal places, and if decimal places are missing from your calculation, the result will be less precise.

Cheers,
Raymond
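Both points can be tried out quickly. The sketch below uses small made-up integer values (not taken from the lab) so the hand check is exact, and then shows how printing an array can hide digits under NumPy's default print settings.

```python
import numpy as np

# Small integer inputs (hypothetical values) make a hand check exact.
x = np.array([2, 3, 4])
w = np.array([5, 6, 7])
b = 8
f = np.dot(x, w) + b      # 2*5 + 3*6 + 4*7 + 8
print(f)

# Printing an array may hide digits: NumPy's default print precision is 8.
a = np.array([0.123456789012])
shown = str(a)            # what print(a) would display
full = repr(float(a[0]))  # shortest string that round-trips the float
print(shown)
print(full)
```

If the printed array is copied into a spreadsheet, only the digits in `shown` make it across, while the computation in the notebook keeps using the full stored value.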

TMosh · December 28, 2023, 6:18pm

You posted in “MLS Resources”, but that’s not the best place for a question about a specific technical item in the course.

I’ve moved your thread to the correct location.

TMosh · December 28, 2023, 6:56pm

Try your spreadsheet calculation using the full-resolution values for w and b.
You can find them in the 4th notebook cell:

Frazer_Mann · December 29, 2023, 9:06pm

Thanks very much.

Frazer_Mann · December 29, 2023, 9:19pm

OK, so I think it’s a rounding error.

I have manually set m to 1 so it only processes the first feature vector, and I’ve asked it to return

f_wb_i

The returned value is 459.9999976…

But if I declare the arrays as float rather than int64 (which is what it said they were when I queried the datatype), I get the same value as I get in Excel.
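The dtype question can be checked directly. This is a sketch using the rounded weights from the first post, with illustrative names `x_int` and `x_float`: when an integer array is combined with float64 weights, NumPy promotes the computation to float64, so the dtype of X on its own should not change the result.

```python
import numpy as np

w = np.array([0.39, 18.75, -53.36, -26.42])   # float64 weights (rounded, from the post)
b = 785.1811367994083

x_int = np.array([2104, 5, 1, 45])            # integer dtype, as reported for the lab data
x_float = x_int.astype(np.float64)            # same values stored as floats

f_int = np.dot(x_int, w) + b
f_float = np.dot(x_float, w) + b

print(x_int.dtype, x_float.dtype)
print(np.result_type(x_int, w))   # dtype NumPy uses for the mixed computation
print(f_int, f_float)
```

If the two results differ in your own notebook, it may be worth checking whether the weights were retyped with fewer decimal places when the arrays were redeclared.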

I’m a little surprised a rounding error made such a big difference.
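One way to see why the difference is this large: any error in a weight is multiplied by the matching feature value. A rough sketch, assuming each weight was rounded to two decimal places and is therefore off by at most 0.005; `max_rounding_error` is a name chosen for illustration.

```python
import numpy as np

x0 = np.array([2104, 5, 1, 45])       # first example from the thread
max_rounding_error = 0.005            # worst case for two-decimal rounding (assumption)

# Worst-case shift in the prediction contributed by each rounded weight.
per_feature = np.abs(x0) * max_rounding_error
worst_case = per_feature.sum()

# NumPy result minus spreadsheet result, both as quoted in the thread.
observed_gap = 459.99999761940825 - 457.2311368

print(per_feature)    # the 2104 feature dominates
print(worst_case)     # upper bound on the total shift
print(observed_gap)   # falls inside that bound
```

Under that assumption the first feature alone can move the prediction by about 10, so a gap of under 3 is well within what two-decimal rounding can produce.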

How do you get around the fact that NumPy doesn’t support decimal datatypes when you’re dealing with financial or engineering data?

Thanks again for your help.

TMosh · December 29, 2023, 9:20pm

This is not a fact. Numpy works perfectly well with floating point values.

Frazer_Mann · December 29, 2023, 9:26pm

I thought there was a difference between float and decimal datatypes? (You know a lot more about this than I do, so I’m guessing I’m mistaken.)

TMosh · December 29, 2023, 11:15pm

Depends on what you mean by “decimal”.

Do you mean “integer”?

Integers are just simplified floating point numbers.

Frazer_Mann · December 30, 2023, 12:40am

I’m used to C#, where int, float, double and decimal are all distinct numerical data types, and the calcs I was planning to do were going to require a large number of decimal places. I’ve done some more googling and it appears as though there are only int and float data types in Python, and it seems like I should be using float32 when I finally get onto neural networks, assuming I want to maintain as much precision as possible?

Thanks again for your time / correcting my earlier post. Hope you have a great weekend
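For readers coming from C#, a short sketch of how the types compare may help. Python's standard library includes a `decimal` module whose `Decimal` type does base-10 arithmetic, and NumPy offers several float widths; `np.finfo` reports roughly how many decimal digits each one carries. This illustrates the types only and makes no recommendation about which one the course expects.

```python
from decimal import Decimal
import numpy as np

# Binary floats cannot represent 0.1 exactly; Decimal works in base 10.
float_sum_exact = (0.1 + 0.2 == 0.3)
decimal_sum_exact = (Decimal("0.1") + Decimal("0.2") == Decimal("0.3"))
print(float_sum_exact)     # False
print(decimal_sum_exact)   # True

# Approximate number of reliable decimal digits for each NumPy float type.
digits32 = np.finfo(np.float32).precision
digits64 = np.finfo(np.float64).precision
print(digits32, digits64)  # 6 15
```

On these figures float64 keeps more digits than float32, so float32 trades precision for memory and speed rather than maximising precision.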
