Vectorization slide

Slide 1 of week 2: Instead of w1*x1 + w2*x2 + w3*x3 + … + wn*xn, shouldn't it be (w1 multiplied across all values of the x1 feature) + (w2 multiplied across all values of the x2 feature) + (w3 multiplied across all values of the x3 feature) + … + (wn multiplied across all values of the xn feature) for multiple regression? If my statement is wrong, please don't reveal the answer; a hint would help me understand the concept further. Thank you very much.

Can you share a screenshot of the slide in question?

Thanks. So I guess your question is, should x represent a sample, or all samples? Am I right?

So my question is: in 'w1*x1', does x1 include all the data under that feature? And so on for each of the w's.

Alright, since you ask me not to reveal the answer, my hint is: your choice on the R.H.S. of the equation shown in the slide depends on your choice on the L.H.S., because that is what an "equation" means: both sides must be equated.

My second hint is: it's better not to assume the answer has to be one or the other. Sometimes maths is pretty flexible. One maths equation can represent a single relation, and it can also represent many relations at once when vectorization is involved. This is why your choice matters.

So, does it depend on which 'w' we are optimizing for… is that how the x in f(x) should be considered?!

@tennis_geek, my answer is no, but I can’t explain. Once I explain, I will give away the answer. It has nothing to do with optimization.

P.S. I have to leave for an hour, but will check this post again.

So, if f(x1) = w1*x1 + b for a simple linear regression, then for f(x1, x2, x3, …, xn) will it be w1*x_row1,feature1 + w2*x_row1,feature2 + … + wn*x_row1,featuren? Is this how the multiple regression algorithm shall handle the entire training data with multiple features?

Previously, I was thinking: (w1*x_row1,feature1 + w1*x_row2,feature1 + w1*x_row3,feature1 + …) + (w2*x_row1,feature2 + w2*x_row2,feature2 + w2*x_row3,feature2 + …) + … + (wn*x_row1,featuren + wn*x_row2,featuren + wn*x_row3,featuren + …).
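To make the two readings concrete, here is a small NumPy sketch with made-up numbers (not from the slide) that computes each interpretation side by side, without saying which one the slide intends:

```python
import numpy as np

# Hypothetical toy data: 3 samples (rows), 2 features (columns).
X = np.array([[1.0, 10.0],
              [2.0, 20.0],
              [3.0, 30.0]])
w = np.array([0.5, 0.1])

# Reading 1: each wi multiplies the feature value of a SINGLE row.
# For row 1: w1*x_row1,feature1 + w2*x_row1,feature2
per_row = w[0] * X[0, 0] + w[1] * X[0, 1]

# Reading 2: each wi multiplies that feature's values across ALL rows.
# (w1*x_row1,feat1 + w1*x_row2,feat1 + ...) + (w2*x_row1,feat2 + ...)
across_rows = (w[0] * X[:, 0]).sum() + (w[1] * X[:, 1]).sum()

print(per_row)      # a single per-sample quantity
print(across_rows)  # one number summed over the whole dataset
```

Running it shows the two readings compute genuinely different quantities: one value per sample versus a single total over the dataset.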

Hello @tennis_geek,

To begin with, I would say the equation was discussed in the context of "describing how you formulate the prediction, using the weights as parameters and the feature values as input", not in the context of "model training".

So in making a prediction, we don't need to consider all samples (or all rows), just the one sample you want to predict for. So, if you have a multiple linear regression with n features, and you want to make a prediction for the first sample (or the first row), then your formula above is correct!

Note that I am talking about making a prediction, not training.
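A minimal NumPy sketch of this point, with made-up numbers (not from the course): the prediction for one sample is a single dot product, and vectorization lets the same weights be applied to every row at once.

```python
import numpy as np

# Hypothetical example: 3 features, 4 training samples.
w = np.array([0.5, 1.0, -2.0])           # weights w1..w3
b = 4.0                                   # bias
X = np.array([[1.0, 2.0, 3.0],            # each row is one sample
              [4.0, 5.0, 6.0],
              [7.0, 8.0, 9.0],
              [1.5, 2.5, 3.5]])

# Prediction for ONE sample (the first row): w1*x1 + w2*x2 + w3*x3 + b
f_row1 = np.dot(w, X[0]) + b

# The same dot product applied to ALL rows at once (vectorized):
f_all = X @ w + b                         # shape (4,), one prediction per row

print(f_row1)    # prediction for the first sample
print(f_all[0])  # same value, taken from the vectorized result
```

This is the "one equation, many relations" idea from the earlier hint: the formula describes a single sample's prediction, and the matrix product repeats that same formula for every row.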


Thanks so much! I quickly cross-referenced the slide, and yes, it is about the prediction part of a multiple-features problem.

I will keep this in mind when the 'training the model' part comes up in my learning journey for multiple regression problems.
