(Multi)variate intuition (?)

@rmwkwok Dear Raymond,

Yes, you are correct: trained all at the same time, as one model.

Like I said, your idea was interesting for me to think about, but not what I was talking/asking about.

For one thing, I feel it is super important to be clear that when I say ‘β’ (as in ‘beta’) I am totally not talking about the ‘b’ as the ‘intercept term’ (i.e. as in Y = mX + b); I am talking about the regression coefficient.

My R skills, at present, are probably stronger than my skills with the equivalent methods in SKLearn, so for R I’d write:

lm(Y ~ house_price + house_color, data = X)

Assuming of course I had a data set X with these two columns.

I mean, possibly I could just try and toss everything at it:

lm(Y ~ ., data = X)

And, actually, I mean that would also (programmatically) probably work;
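For instance, a toy version of that (made-up data, with house_color coded as 0/1) would be:

    # Made-up data set with the two columns plus the target Y
    set.seed(42)
    X <- data.frame(
      house_price = runif(50, 100, 500),
      house_color = sample(0:1, 50, replace = TRUE)
    )
    X$Y <- 3 * X$house_price + 10 * X$house_color + rnorm(50)

    lm(Y ~ house_price + house_color, data = X)  # explicit terms
    lm(Y ~ ., data = X)                          # "." means: all other columns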

Yet my point is, this is where I am still learning.

Even in the article previously linked, I know they express only two ‘beta terms’. But in a traditional regression model you could have, say, 12, or even hundreds of those.

Thanks everyone for the help, but if I am so totally off the mark please explain to me just this:

How does y = wx + b possibly equal y = (w_1 x_1) + (w_2 x_2) + b?

I mean there is an addition sign between the multiplication operations…

Because w is a 2 x 1 vector and x is a 2 x 1 vector, and that is what a dot product multiplication does.


w and x are both vectors.

The definition of a dot product is that it involves elementwise multiplication followed by addition as a unified operation.
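For example, with made-up numbers, the dot product and the fully written-out sum are the same computation:

    # Dot product = elementwise multiplication followed by addition
    w <- c(2, 3)      # w_1, w_2
    x <- c(10, 100)   # x_1, x_2
    b <- 5

    sum(w * x) + b                     # w_1*x_1 + w_2*x_2 + b = 2*10 + 3*100 + 5 = 325
    (w[1] * x[1]) + (w[2] * x[2]) + b  # identical: 325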

Well, okay guys, I guess I just asked a ‘stupid question’ [while still learning] to pave the path for others…

I think you are raising a really nice question. I am not quite sure about the answer, but I think there are several things you can consider. First, before you input all your data into your network, what does the data look like? What are the features? There are different ways you can explore the data before you train the model. Then, you can consider whether to use some feature selection techniques or regularization to surface the most critical features in your training process, and from those explorations you can generate a good feature set. Finally, for me, a neural network is still something like a black box, which means you can use it without seeing the details, so tossing all your data into X might be useful in places.
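As a rough sketch of that workflow in R (toy data with made-up feature names; step() is just one simple option for feature selection):

    # Toy data: three made-up features, only two of which actually matter
    set.seed(7)
    X <- data.frame(f1 = rnorm(100), f2 = rnorm(100), f3 = rnorm(100))
    X$Y <- 2 * X$f1 - X$f2 + rnorm(100)

    summary(X)                  # explore the features before modelling
    fit <- lm(Y ~ ., data = X)  # toss everything in as a baseline
    step(fit)                   # AIC-based stepwise selection of the critical features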


Well I get that part, but that is between two matrices/vectors. I didn’t realize it is ‘totally communitive’, as in, across multiple terms.

I’m not familiar with the term “communitive”. What does that mean?

If you mean commutative, then both multiplication and addition are commutative, the last time I heard. Applying that to the definition of the dot product, the operation between two vectors will be commutative as well. Why is that important or relevant here? Are you attempting to make some kind of symmetry argument?

Note that the formula works equally well regardless of how many “features” or elements are in each input sample vector x. Of course the size of the weight vector w must match the size of the sample vectors.
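For instance, nothing about the formula changes if the vectors have 12 elements instead of 2 (a sketch with random made-up values):

    # The same formula for any number of features, as long as the lengths match
    n <- 12              # could just as well be 2, or hundreds
    w <- rnorm(n)
    x <- rnorm(n)
    b <- 0.5
    y <- sum(w * x) + b  # still a single scalar prediction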

No problem, I have never thought of your β as b. Like I said: [image] and [image]

Very well! I do not use R, but I took the liberty of checking out some R documentation, and found out that your R script does the same thing as mine (the lower right corner): [image]

Y is equal to a weight w_1 times house_size, plus w_2 times house_color, plus b, where w_1, w_2 and b are assumed by R even if you do not specify them. It is R’s job to fit the values of w_1, w_2 and b for you. You should see them in your result after running that R program.

As explained by Paul and Tom, w is a vector, x is a vector, and all other symbols are scalars. Below shows you the relation between them:

[image: y = w · x + b = (w_1 x_1) + (w_2 x_2) + b]

Finally, @Nevermnd, even your R code is still showing just the same model as mine, so I believe that all this time you have been correctly telling us how we model things, and that we do things in the same way; it is just that we have different ways of using symbols. Such a difference is important, because it can convey a different message, perceived by the others as a different idea.

In particular, your [image] was affected by R’s way of declaring a model → [image], I suppose? Please note that even R does not actually combine them; they are kept separate at all times. As I said, that R code implies the following:

[image: Y = w_1 * house_size + w_2 * house_color + b]

Cheers,
Raymond


And, I am sorry Paul; assume the situation I presented:

So you are suggesting

Yes, I meant ‘commutative’. I mean, even Prof. Ng has made some mistakes/corrections to his lectures at times, and he is obviously a much smarter guy than I am, so I am not worried about being correct ‘absolutely’ 100% of the time.

And I mean yes, of course, when it comes to standard addition/multiplication they are always commutative. But my point was more that, to my understanding, this is not the case when it comes to matrix multiplication/dot products (except in very special cases, such as multiplying a matrix with itself or with the identity matrix). There, the order matters.

@rmwkwok Dear Raymond,

Thank you for your detailed analysis/follow-up. I mean I first learned regression a long time ago in Econometrics.

And I think your explanation gets to the heart of my original confusion:

So if we do an lm(Y ~ house_size + house_color, data = X), is the ‘plus’ (+) sign more of a ‘programming convention’ for adding additional terms?

I had always thought (presumably reasonably) that it was conveying an actual math operation. But maybe not…


Hello @Nevermnd,

It is R’s convention. The developers of R are free to choose whatever way they like. We do not do things like that in Python, and I have never seen something like that elsewhere either.

No. It is not. :wink:

As a guy in data science, from time to time I have to use new libraries, and the first thing to do is to read their documentation to understand the intentions of the library’s developers. What I can do now is to share with you how I discovered that, just in case it can give you some impression that may help you in the future:

  1. I googled “R lm function”.
  2. I found this page, which was the first result on Google. Its title includes the keyword “documentation”.
  3. Your Y ~ house_size + house_color is the first argument, so I looked at lm’s first argument too, which is: [image]
  4. It asked us to look at “Details”, so I did: [image]

    The above is only a part of it.
  5. Then I googled for some examples with the keywords “R lm example”.
  6. And I read 2-3 pages to finally confirm my understanding.
  7. If you look at step 3 again, “formula” has a link that will bring you to more detailed explanations of the symbols other than “+” (plus) that you can use there. For example, you can use “*” (asterisk), but it does not mean multiplication at all (see the sketch after this list).
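To make step 7 concrete, here is a minimal sketch (toy data, made-up column names a and b) showing that “+” and “*” in an R formula are term operators rather than arithmetic; R’s I() wrapper is what forces actual arithmetic inside a formula:

    # Toy data with two hypothetical predictors
    set.seed(1)
    X <- data.frame(a = rnorm(20), b = rnorm(20))
    X$Y <- 1 + 2 * X$a - 3 * X$b + rnorm(20, sd = 0.1)

    coef(lm(Y ~ a + b, data = X))     # "+" adds a term: separate coefficients for a and b
    coef(lm(Y ~ I(a + b), data = X))  # I() forces arithmetic: one coefficient for the sum a + b
    coef(lm(Y ~ a * b, data = X))     # "*" means crossing: a, b, and the a:b interaction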

:wink:

Cheers,
Raymond


Yes, it is the case that matrix multiplication is not commutative in general. Pure vector dot product (“vector on vector”) is commutative if you treat the vectors as 1D vectors, not “row” or “column” vectors.
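A quick sketch of that distinction, with made-up numbers:

    # Matrix multiplication: order matters in general
    A <- matrix(1:4, nrow = 2)
    B <- matrix(5:8, nrow = 2)
    identical(A %*% B, B %*% A)   # FALSE

    # 1D vector dot product: order does not matter
    u <- c(1, 2, 3)
    v <- c(4, 5, 6)
    sum(u * v) == sum(v * u)      # TRUE: 32 either way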