Confusion between dimension and line equation

When there are n features in multiple linear regression, we often call the problem n-dimensional, because we map each feature to a separate dimension.

I find the line equation with multiple features, y = w_1 x_1 + w_2 x_2 + ... + w_n x_n + b, a bit confusing, because a line is always one-dimensional. I mean, it is so confusing that I can't even form the question about it :sweat_smile:

Do not confuse the equation of a straight line in 2D with the concept of “linear regression”.
“Linear” refers to the linear combination of the features and weights.
It has nothing to do with the shape of the curve.
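To make "linear combination" concrete, here is a minimal NumPy sketch of the equation from the question. The weights, bias, and feature values are made-up numbers for illustration only, not taken from any course material.

```python
import numpy as np

# Hypothetical values for illustration: n = 3 features.
w = np.array([2.0, -1.0, 0.5])   # weights w_1 .. w_n
b = 4.0                          # bias
x = np.array([1.0, 3.0, 2.0])    # one record with features x_1 .. x_n

# Linear combination: y = w_1*x_1 + w_2*x_2 + ... + w_n*x_n + b
y = np.dot(w, x) + b
print(y)
```

The prediction is a weighted sum of the features plus a constant, whatever n is. Nothing here requires the result to be drawable as a line.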


So you mean the distinction between feature and dimension drawn in "Difference between Dimension, Attribute and Feature in Machine Learning" on Stack Overflow is incorrect?

I have no opinion about what is posted on StackOverflow.

So I was confusing the dimension (or rank) of a vector with the dimension of a coordinate system in space. A typical maths and physics confusion, I would say.

In linear algebra, the dimension of a vector is simply its number of components, and in machine learning each component is called a feature.

Visualizing the dimensions in space comes into play when we try to project the vectors onto a coordinate system.
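The same clash of meanings shows up in NumPy, which may help keep the two apart. A small sketch, using a made-up record with four features:

```python
import numpy as np

# Hypothetical record with n = 4 features.
x = np.array([5.1, 3.5, 1.4, 0.2])

print(x.shape)  # (4,): four components, i.e. a point in 4-dimensional space
print(x.ndim)   # 1: NumPy's "ndim" counts array axes, not components

# A dataset of 3 such records is a 2-axis array with 4 feature columns.
X = np.vstack([x, x + 1.0, x + 2.0])
print(X.shape)  # (3, 4)
print(X.ndim)   # 2
```

So "dimension" in the machine learning sense is the length of the feature vector (the 4 in the shape), while `ndim` is the number of axes of the array that stores it.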

Hello @tbhaxor,

I think when it comes to discussing a dataset, feature = attribute = column (vector) = dimension = independent variable. Attribute is a pretty cold name (or one a software guy would use?), feature sounds like useful stuff, and column is the language for a table or a 2D matrix. Independent variable is what a maths guy would use? Dimension may be used more by scientists, or when it comes to visualization?

Rank describes a matrix: it is the number of linearly independent columns (so we say column here). A matrix has full column rank when all of its columns are linearly independent of each other.
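A quick way to check this numerically is `np.linalg.matrix_rank`. The two small matrices below are made-up examples: in the first the columns are independent, in the second the second column is just twice the first.

```python
import numpy as np

# Columns are linearly independent: full column rank.
X_full = np.array([[1.0, 0.0],
                   [0.0, 1.0],
                   [1.0, 1.0]])

# Second column is 2 * first column: the columns are linearly dependent.
X_dep = np.array([[1.0, 2.0],
                  [2.0, 4.0],
                  [3.0, 6.0]])

rank_full = np.linalg.matrix_rank(X_full)
rank_dep = np.linalg.matrix_rank(X_dep)
print(rank_full)  # 2
print(rank_dep)   # 1
```

Both matrices have two feature columns, but only the first has rank 2. In the second, one feature carries no information that the other does not already have.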

A linear regression is, as Tom said, for fitting the best linear combination of features (or the column vectors of the matrix), and it is a line only when there is one feature.

When you have one feature, and one predicted variable, you can use them to form a 2D space and draw a line that represents their relation. When you have two features and one predicted variable, you need to draw a plane in a 3D space.
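As a rough sketch of the line versus plane point, the snippet below fits both cases with ordinary least squares via `np.linalg.lstsq`. The helper `fit_linear` and the noise-free toy data are assumptions made up for this illustration. The fitting code is identical in both cases; only the number of feature columns changes.

```python
import numpy as np

def fit_linear(X, y):
    """Least-squares fit of y ~ X @ w + b (hypothetical helper for illustration)."""
    X = np.asarray(X, dtype=float)
    # Append a column of ones so the bias b is fitted together with the weights.
    A = np.column_stack([X, np.ones(len(X))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef[:-1], coef[-1]

# One feature: the data lie on the line y = 2*x + 1 in a 2D space (x, y).
x1 = np.array([[0.0], [1.0], [2.0], [3.0]])
y_line = 2.0 * x1[:, 0] + 1.0
w_line, b_line = fit_linear(x1, y_line)

# Two features: the data lie on the plane y = 2*x_1 - 3*x_2 + 1 in a 3D space.
X2 = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]])
y_plane = 2.0 * X2[:, 0] - 3.0 * X2[:, 1] + 1.0
w_plane, b_plane = fit_linear(X2, y_plane)

print(w_line, b_line)    # one weight: a line
print(w_plane, b_plane)  # two weights: a plane
```

With n features the same code fits a hyperplane in an (n + 1)-dimensional space, which we can no longer draw but can still compute.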



What a clever response Raymond! I am so impressed


Yes, it is all about representation and interpretation. In a coordinate system we can identify each record as (x_1, x_2, \cdots, x_n), which can also be plotted in n-dimensional space.

Because we as humans lack the ability to visualize more than 3 dimensions, we try to explain these in terms of different representations, like you mentioned (column/attribute/field for a database developer, independent variable/feature for a statistician).

@tbhaxor, exactly. Besides “records”, we have “tuples” (database), “rows” (tables), samples, examples, data points, … Certainly we can speak about their differences, because after all they have different origins, but we all know what they refer to when we are discussing a dataset.



I need to learn linear algebra and clear up these concepts after completing this course :sweat_smile: