What advantage do we gain by splitting the single 3-category feature, Ear Shape, into Pointy ears, Floppy ears, and Oval ears using one-hot encoding, as opposed to having three branches in the decision tree when the node is split on Ear Shape, with each branch assigned a distinct category value?
One advantage I can see is that the decision tree built with one-hot encoding will be simpler to understand and analyze, with a maximum of 2 branches per node. Is there any other advantage?
You can’t have text in the rows when fitting a model.
One-hot encoding converts each categorical value into a numeric “dummy variable”.
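For instance, here is a minimal sketch with pandas (the column name and values are made up to mirror the lecture example):

```python
import pandas as pd

# Toy data made up for illustration; "EarShape" mirrors the ear-shape
# feature from the lectures.
df = pd.DataFrame({"EarShape": ["Pointy", "Floppy", "Oval", "Pointy"]})

# One-hot encode: each category becomes its own dummy column
# (EarShape_Floppy, EarShape_Oval, EarShape_Pointy), so the model
# only ever sees numbers/booleans, never text.
one_hot = pd.get_dummies(df, columns=["EarShape"])
print(one_hot)
```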
Hi @abhilash341,
Besides @dan_herman’s answer, what other problems can you think of without one-hot encoding them?
Let’s say we use numbers to represent ear shape: 1, 2, 3 stand for pointy, oval, and floppy respectively. Given that you know how splitting works (the lecture explained that part), what are the possible splits?
What if you change the order of those shapes, so that now 2, 1, 3 represent pointy, oval, and floppy respectively?
Cheers,
Raymond
This is my intuition.
Hello @abhilash341,
This is not how most decision tree packages work; they do not give you a three-way split, only a two-way split. For example, a tree can split between 1 and 2, so that you have a group of (1) and a group of (2, 3), or it can split between 2 and 3, so that you have a group of (1, 2) and a group of (3).
So, do you see any problem? Is there anything that only one-hot encoded features can give you that the above two-way splits can’t?
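Here is a rough sketch in plain Python (using just the encodings discussed above) that enumerates which groupings a numeric threshold split can and cannot produce:

```python
def threshold_splits(encoding):
    """Return the category groupings reachable by a single threshold split."""
    names = [name for name, _ in sorted(encoding.items(), key=lambda kv: kv[1])]
    # A threshold can only cut the sorted codes into a left block and a right block.
    return [(set(names[:i]), set(names[i:])) for i in range(1, len(names))]

# 1, 2, 3 = pointy, oval, floppy
print(threshold_splits({"pointy": 1, "oval": 2, "floppy": 3}))
# -> ({pointy}, {oval, floppy}) and ({pointy, oval}, {floppy});
#    {oval} can never be isolated by a single split.

# 2, 1, 3 = pointy, oval, floppy (same data, different arbitrary labels)
print(threshold_splits({"pointy": 2, "oval": 1, "floppy": 3}))
# -> ({oval}, {pointy, floppy}) and ({oval, pointy}, {floppy});
#    a different set of reachable splits, purely because of the labels.
```

With one-hot encoding there is one 0/1 column per shape, so the tree can always isolate any single category in one split, no matter how you ordered the labels.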
Raymond
Generally you should not use an enumerated list to identify different classes. This implies an unnecessary (and probably incorrect) linear relationship between the classes.
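As a rough sketch with scikit-learn (the toy data, and the assumption that only oval ears indicate a cat, are made up for illustration), the integer encoding forces the tree to spend two threshold splits to isolate the middle category, while a one-hot column does it in one:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

# Integer codes 1/2/3 = pointy/oval/floppy; suppose only "oval" (code 2) means cat.
X_int = np.array([[1], [2], [3], [1], [2], [3]])
y = np.array([0, 1, 0, 0, 1, 0])
tree_int = DecisionTreeClassifier().fit(X_int, y)
print(export_text(tree_int, feature_names=["ear_shape_code"]))  # needs depth 2

# One-hot column "is_oval": a single split separates the classes.
X_oval = np.array([[0], [1], [0], [0], [1], [0]])
tree_oh = DecisionTreeClassifier().fit(X_oval, y)
print(export_text(tree_oh, feature_names=["is_oval"]))  # depth 1
```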