I don't understand how one-hot encoded features are used in a decision tree. How does the tree split on them at each node when calculating information gain?
Hi @Aman6
Original Dataset:
Numeric | Category |
---|---|
5 | Red |
8 | Blue |
3 | Green |
6 | Yellow |
After one-hot encoding and removing the “Category” column:
Numeric | Red | Blue | Green | Yellow |
---|---|---|---|---|
5 | 1 | 0 | 0 | 0 |
8 | 0 | 1 | 0 | 0 |
3 | 0 | 0 | 1 | 0 |
6 | 0 | 0 | 0 | 1 |
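
As a rough sketch of how the table above could be produced, here is a plain-Python version of the encoding step. The helper name `one_hot` is made up for this illustration; in practice you would more likely use something like pandas `get_dummies` or scikit-learn's `OneHotEncoder`.

```python
# Rows from the original table: (Numeric, Category)
rows = [(5, "Red"), (8, "Blue"), (3, "Green"), (6, "Yellow")]

# Keep categories in order of first appearance so the columns match the table
categories = list(dict.fromkeys(category for _, category in rows))


def one_hot(rows, categories):
    """Hypothetical helper: replace the category with one 0/1 column per category."""
    encoded = []
    for numeric, category in rows:
        record = {"Numeric": numeric}
        for name in categories:
            record[name] = 1 if category == name else 0
        encoded.append(record)
    return encoded


encoded_rows = one_hot(rows, categories)
for record in encoded_rows:
    print(record)
```

Each row ends up with exactly one 1 among the new columns, and the original "Category" column is gone.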
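After we convert the data, we treat it like any other dataset. Each one-hot column is just a binary feature, so at each node the tree can test a split such as Red = 1 versus Red = 0 and compute the information gain for it exactly as it would for a threshold on the Numeric column.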
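
To make that concrete, here is a minimal sketch of the information gain calculation on the encoded table. The class labels in `y` are not part of the example above; they are invented purely so there is something to split on. The function names are also just for illustration. The sketch assumes a threshold-style split (`value <= threshold`), so a 0/1 column gets a single candidate threshold of 0.5, which separates the rows where the category is absent from the rows where it is present.

```python
from collections import Counter
from math import log2

columns = ["Numeric", "Red", "Blue", "Green", "Yellow"]
X = [
    [5, 1, 0, 0, 0],
    [8, 0, 1, 0, 0],
    [3, 0, 0, 1, 0],
    [6, 0, 0, 0, 1],
]
# Hypothetical class labels, invented only for this illustration
y = ["yes", "no", "no", "yes"]


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(X, y, col, threshold):
    """Gain from splitting on X[col] <= threshold versus X[col] > threshold."""
    left = [label for row, label in zip(X, y) if row[col] <= threshold]
    right = [label for row, label in zip(X, y) if row[col] > threshold]
    if not left or not right:
        return 0.0
    weighted = (len(left) * entropy(left) + len(right) * entropy(right)) / len(y)
    return entropy(y) - weighted


def candidate_thresholds(X, col):
    """Midpoints between consecutive distinct values in a column."""
    values = sorted({row[col] for row in X})
    return [(a + b) / 2 for a, b in zip(values, values[1:])]


# Score every candidate split; one-hot columns go through the same code path
gains = {}
for col, name in enumerate(columns):
    for threshold in candidate_thresholds(X, col):
        gains[(name, threshold)] = information_gain(X, y, col, threshold)

for (name, threshold), gain in gains.items():
    print(f"{name} <= {threshold}: gain = {gain:.4f}")
```

The point is that nothing in the loop knows which columns came from one-hot encoding. A numeric column simply has several candidate thresholds, while each 0/1 column has one, and the tree picks whichever split gives the highest gain at that node.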