When watching the video, I learned that when calculating information gain to split a dataset using a continuous attribute (not discrete values that can use one-hot encoding), the decision tree algorithm selects various thresholds, computes the information gain, and then chooses the threshold that provides the highest information gain. However, does this approach work correctly if the dataset to be split contains more than two distinct classes (multi-class classification)? It seems quite unusual for a single threshold to separate 3, 4, 5, or more different classes. Is it normal to apply decision trees to multi-class classification tasks?
Link to the video: https://www.coursera.org/learn/advanced-learning-algorithms/lecture/a4v1O/continuous-valued-features
Yes, they can be used for multiclass classification. I don’t remember the videos from the course you are taking, but here is another explanation of decision trees for multiclass classification:
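To make the idea concrete, here is a minimal sketch (not taken from the course) of why one threshold at a time is enough. Entropy is defined over any number of classes, so the information gain of a threshold can be computed the same way for 3 classes as for 2. A single split does not need to separate every class; it only needs to make the two sides purer, and the tree then splits each side again. The function names `entropy` and `best_threshold` and the toy data are made up for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy in bits; works for any number of classes."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def best_threshold(values, labels):
    """Try midpoints between sorted distinct values; return (threshold, gain)."""
    parent = entropy(labels)
    distinct = sorted(set(values))
    best_t, best_gain = None, 0.0
    for lo, hi in zip(distinct, distinct[1:]):
        t = (lo + hi) / 2
        left = [y for x, y in zip(values, labels) if x <= t]
        right = [y for x, y in zip(values, labels) if x > t]
        w = len(left) / len(labels)
        gain = parent - (w * entropy(left) + (1 - w) * entropy(right))
        if gain > best_gain:
            best_t, best_gain = t, gain
    return best_t, best_gain


# Hypothetical toy data: one continuous feature, three classes.
weights = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
animals = ["cat", "cat", "cat", "dog", "dog", "horse", "horse"]

# First split: isolates the cats, leaves dogs and horses mixed.
t1, g1 = best_threshold(weights, animals)
print(t1, round(g1, 3))  # 3.5 0.985

# Second split, applied only to the right branch (weight > t1).
right = [(x, y) for x, y in zip(weights, animals) if x > t1]
t2, g2 = best_threshold([x for x, _ in right], [y for _, y in right])
print(t2, round(g2, 3))  # 5.5 1.0
```

The first threshold only pulls one class away from the other two, and the second threshold, deeper in the tree, separates the remaining classes. With more classes the tree simply grows more levels.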
Thank you very much, I will study this topic more.