In the “Using one-hot encoding of categorical features” video, Andrew talks about one-hot encoding as a way of handling categorical features that can take on more than two values, creating k binary features where k is the number of possible values.
I recall that for regressions we’d generally use k-1 dummy features to prevent multicollinearity. Is multicollinearity something that is not usually a concern with decision trees because each split considers only one feature at a time?
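For concreteness, here is a quick sketch of the two encodings I have in mind (my own illustration with pandas, not from the course; the ear-shape values are just a toy stand-in for the example in the videos):

```python
import pandas as pd

# Hypothetical toy data, loosely based on the ear-shape example from the videos.
df = pd.DataFrame({"ear_shape": ["pointy", "floppy", "oval", "pointy"]})

# Full one-hot encoding as in the video: k = 3 binary columns.
one_hot_k = pd.get_dummies(df, columns=["ear_shape"])

# "Dummy" encoding as in regression practice: k - 1 = 2 columns,
# dropping the first level so the remaining columns are not redundant.
one_hot_k_minus_1 = pd.get_dummies(df, columns=["ear_shape"], drop_first=True)

print(one_hot_k.columns.tolist())          # 3 ear_shape_* columns
print(one_hot_k_minus_1.columns.tolist())  # 2 ear_shape_* columns
```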
Sorry, I didn’t express myself clearly - k-1 didn’t come from this course.
IIRC from econometrics classes, it is standard practice for (linear?) regressions to drop one of the dummy variables to avoid the dummy variable trap and the resulting multicollinearity. If all k dummies are present, exactly one of them is 1 and the rest are 0 for every observation; equivalently, their sum is always 1, so they are perfectly collinear with the intercept column and the design matrix no longer has full rank.
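To show what I mean, here is a small NumPy sketch (my own illustration, not from any course material): with an intercept plus all k dummies, the columns of the design matrix become linearly dependent, which is exactly the multicollinearity the dummy variable trap refers to.

```python
import numpy as np

# Hypothetical design matrix: intercept column plus all k = 3 dummies.
intercept = np.ones((4, 1))
dummies = np.array([
    [1, 0, 0],   # pointy
    [0, 1, 0],   # floppy
    [0, 0, 1],   # oval
    [1, 0, 0],   # pointy
])
X_full = np.hstack([intercept, dummies])

# The dummies sum to 1 in every row, i.e. they add up to the intercept
# column, so the 4 columns are linearly dependent: rank is 3, not 4.
print(np.linalg.matrix_rank(X_full))     # 3

# Dropping one dummy restores full column rank (3 columns, rank 3).
X_dropped = np.hstack([intercept, dummies[:, 1:]])
print(np.linalg.matrix_rank(X_dropped))  # 3
```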
I assume that multicollinearity with decision trees is not an issue because we generally split on a single dummy variable, but would like to confirm that.
Yes, I agree with you on that - the tree considers only one feature at a time when choosing a split, and we cannot always know beforehand which of the one-hot features carries the least information and would therefore be the best one to drop.
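As a quick sanity check, here is a minimal scikit-learn sketch (my own illustration, not from the course): a tree trained on all k one-hot columns fits without any problem; the redundancy among the columns only means that some of them may never be selected for a split.

```python
from sklearn.tree import DecisionTreeClassifier, export_text

# Hypothetical toy data: all k = 3 one-hot ear-shape columns plus a binary label.
X = [
    [1, 0, 0],  # pointy
    [0, 1, 0],  # floppy
    [0, 0, 1],  # oval
    [1, 0, 0],  # pointy
    [0, 1, 0],  # floppy
]
y = [1, 0, 1, 1, 0]

# The tree accepts all k redundant columns; each split tests just one of them.
tree = DecisionTreeClassifier(random_state=0).fit(X, y)
print(export_text(tree, feature_names=["pointy", "floppy", "oval"]))
```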