Linear regression is a supervised learning method, which means that regardless of the number of independent variables, there is one dependent variable. The algorithm analyzes and learns from at least two variables (the independent x and the dependent y), for example when mapping Celsius to Fahrenheit. Shouldn’t it be called **bivariate**?

edited, see answers below.

Hi there,

I noticed that there are slightly different definitions in maths and stats.

But let me try to answer it as follows:

As you pointed out, you only have one feature vector (1D) to describe your label.

The mathematical term univariate describes dependence on only one variable. In statistics the term has a similar meaning: the “measurement variable”. In our supervised learning model, we assume that there is only one independent variable, which is 1D. This is the case in your model.

Apparently there are also differing definitions: the question is always whether the term refers only to the independent variable or to all variables, see also:

However: It seems that the majority of sources echo your understanding of the term bivariate!

Thanks for pointing it out.

In conclusion:

- if you hear the term univariate regression, you can assume there is only one independent variable: the 1D feature vector used to predict the label
- if you hear bivariate regression, better clarify how many independent variables there are, since people may understand the term differently.

Also: The mapping from Celsius to Fahrenheit is deterministic: °F = (°C × 9/5) + 32

So you would have only one random variable to measure, which is univariate.
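To make the Celsius example concrete, here is a minimal sketch of a regression with a single feature, fitted on noise-free data generated from the formula above. The helper name `fit_line` and the sample temperatures are made up for illustration; the fit uses the standard closed-form least-squares solution for one feature, not necessarily the method used in the course.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (w, b) for y ≈ w * x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope is the covariance of x and y divided by the variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    w = sxy / sxx
    b = mean_y - w * mean_x
    return w, b


# Hypothetical sample points; the labels follow the deterministic mapping exactly.
celsius = [-40.0, 0.0, 10.0, 25.0, 37.0, 100.0]
fahrenheit = [c * 9 / 5 + 32 for c in celsius]

w, b = fit_line(celsius, fahrenheit)
print(w, b)  # slope close to 9/5 and intercept close to 32, up to floating-point rounding
```

There is one input column (`celsius`) and one label (`fahrenheit`). Because the mapping is deterministic, the fitted slope and intercept simply recover the constants of the formula.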

Best regards

Christian

So can we say since we are applying some stats on the one variable (independent vector), that is why it is univariate?

If you are applying statistics to one variable, then yes, this is by definition univariate.

Also:

In statistics, a univariate distribution characterizes one variable, although it can be applied in other ways as well. For example, univariate data are composed of a single scalar component. In time series analysis, the whole time series is the “variable”: a univariate time series is the series of values over time of a single quantity. Correspondingly, a “multivariate time series” characterizes the changing values over time of several quantities.

In some cases, the terminology is ambiguous, since the values within a univariate time series may be treated using certain types of multivariate statistical analyses and may be represented using multivariate distributions.
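One way to see this ambiguity is a lag embedding, sketched below: a single series of values is rearranged into rows of consecutive observations, which can then be handled like multivariate data. The function name `lag_matrix` and the sample values are hypothetical and only serve as an illustration.

```python
def lag_matrix(series, n_lags):
    """Turn a univariate series into rows of n_lags consecutive values."""
    return [series[i:i + n_lags] for i in range(len(series) - n_lags + 1)]


# Hypothetical univariate time series: one quantity measured over time.
temps = [12.0, 13.5, 13.0, 14.2, 15.1, 14.8]

rows = lag_matrix(temps, 3)
for row in rows:
    print(row)
```

Each row now has three components, even though only one quantity was ever measured.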

Best regards

Christian

Hi there,

Sometimes a picture is worth a thousand words. I found this one today and had to share it with you.

@tbhaxor:

see also this thread.

Best regards

Christian

Moved this thread to course 1 week 1.