Continuous Value Splitting

Kaitlyn_Hu · January 3, 2023, 11:26pm

I have a question related to the practice quiz. The question is “For a continuous valued feature (such as weight of the animal), there are 10 animals in the dataset. According to the lecture, what is the recommended way to find the best split for that feature?”

The correct answer is “Choose the 9 mid-points between the 10 examples as possible splits, and find the split that gives the highest information gain.”
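For concreteness, here is a minimal sketch of the midpoint approach the quiz answer describes. The weights, labels, and function names (`entropy`, `information_gain`, `best_midpoint_split`) are made up for illustration and are not taken from the lecture.

```python
import math

def entropy(labels):
    """Entropy in bits of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    if p == 0 or p == 1:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def information_gain(values, labels, threshold):
    """Entropy reduction from splitting on values <= threshold."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    n = len(labels)
    weighted = len(left) / n * entropy(left) + len(right) / n * entropy(right)
    return entropy(labels) - weighted

def best_midpoint_split(values, labels):
    """Try every midpoint between consecutive distinct sorted values."""
    xs = sorted(set(values))
    candidates = [(a + b) / 2 for a, b in zip(xs, xs[1:])]
    return max(candidates, key=lambda t: information_gain(values, labels, t))

# Hypothetical data: 10 animal weights and whether each animal is a cat.
weights = [7.2, 8.4, 8.8, 9.2, 10.2, 11.0, 15.0, 17.2, 18.0, 20.0]
is_cat = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]

best = best_midpoint_split(weights, is_cat)
print(best, information_gain(weights, is_cat, best))
```

With 10 distinct weights there are 9 candidate thresholds, and each one is scored against all 10 examples.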

I think this method may be good for a small dataset. If there are millions of different weights, would it still be the recommended one? I may not fully understand this question. Can someone enlighten me?

pastorsoto · January 4, 2023, 12:21am

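Yes, I think you are correct: that approach becomes expensive for large datasets. In that case, a search strategy such as binary search over the candidate thresholds may be a better way to look for the split that maximizes information gain. Note that this is only a heuristic, because information gain does not necessarily rise and then fall smoothly as the threshold moves, so the search can miss the best split.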

Alternatively, you could use a decision tree algorithm that uses a different approach to find the best split for each feature. These algorithms typically build the tree in a top-down, greedy manner, choosing the split that gives the highest information gain at each step. They can be more efficient than using binary search, especially for larger datasets, but they may also be more prone to overfitting.
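One common way to keep the number of candidates small on a large dataset is to evaluate only a fixed number of quantile-based thresholds instead of every midpoint, which is the idea behind histogram-based tree learners. The sketch below is an illustrative assumption, not something from the lecture; `quantile_candidates` and `max_candidates` are hypothetical names, and the data is randomly generated.

```python
import random
import statistics

def quantile_candidates(values, max_candidates=32):
    """Return at most max_candidates thresholds at evenly spaced quantiles."""
    # statistics.quantiles with n groups returns n - 1 cut points.
    cuts = statistics.quantiles(values, n=max_candidates + 1)
    # Duplicates can appear when many values are identical, so deduplicate.
    return sorted(set(cuts))

# Hypothetical data: 100,000 random weights.
random.seed(0)
weights = [random.uniform(1.0, 30.0) for _ in range(100_000)]

candidates = quantile_candidates(weights, max_candidates=32)
print(len(candidates))
```

Each candidate would then be scored by information gain in the usual way. The trade-off is that the exact best threshold may fall between two quantiles and be missed.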

Kaitlyn_Hu · January 4, 2023, 3:48am

Thank you very much!
