Can we use gradient descent to find the value of the split threshold that gives the highest information gain?

chaohan · August 10, 2022, 7:40pm

Hello,

In the practice quiz, Question 3 asks:

For a continuous valued feature (such as weight of the animal), there are 10 animals in the dataset. According to the lecture, what is the recommended way to find the best split for that feature?

The following option is considered as a wrong answer:

Use gradient descent to find the value of the split threshold that gives the highest information gain.

I’m wondering why this is wrong. It seems to me that the entropy after splitting on a continuous feature can be treated as a cost function of the split threshold. Couldn’t we use gradient descent to find the split threshold that minimizes that entropy?

TMosh · August 10, 2022, 8:04pm

It may be possible, but the quiz asks you to consider what was presented in the lecture.
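
For reference, here is a minimal sketch of the usual alternative to gradient descent: sort the feature values, take the midpoints between consecutive values as candidate thresholds (9 candidates for 10 animals), and keep the one with the highest information gain. It assumes binary labels. The data and the function names (`entropy`, `information_gain`, `best_threshold`) are made up for illustration and are not taken from the course materials. The sketch also hints at why gradient descent is an awkward fit: the information gain only changes when the threshold crosses a data point, so as a function of the threshold it is piecewise constant and its gradient is zero almost everywhere.

```python
import numpy as np

def entropy(p):
    """Binary entropy in bits; taken to be 0 when p is 0 or 1."""
    if p == 0 or p == 1:
        return 0.0
    return -p * np.log2(p) - (1 - p) * np.log2(1 - p)

def information_gain(x, y, threshold):
    """Information gain from splitting binary labels y at x <= threshold."""
    left = y[x <= threshold]
    right = y[x > threshold]
    if len(left) == 0 or len(right) == 0:
        return 0.0  # a split that leaves one side empty gains nothing
    w_left = len(left) / len(y)
    w_right = len(right) / len(y)
    weighted = w_left * entropy(left.mean()) + w_right * entropy(right.mean())
    return entropy(y.mean()) - weighted

def best_threshold(x, y):
    """Try every midpoint between consecutive sorted values; return the best one."""
    values = np.unique(x)  # sorted unique feature values
    candidates = (values[:-1] + values[1:]) / 2
    gains = [information_gain(x, y, t) for t in candidates]
    best = int(np.argmax(gains))
    return candidates[best], gains[best]

# Hypothetical data: weights of 10 animals and whether each one is a cat.
weights = np.array([7.2, 8.4, 8.8, 9.2, 10.2, 11.0, 15.0, 17.0, 18.0, 20.0])
is_cat = np.array([1, 1, 1, 1, 1, 0, 0, 0, 0, 0])

threshold, gain = best_threshold(weights, is_cat)
print(threshold, gain)

# Any two thresholds between the same pair of data points give the same split,
# and therefore the same information gain.
print(information_gain(weights, is_cat, 10.3) == information_gain(weights, is_cat, 10.9))
```

With only a handful of candidate thresholds, trying them all is cheap and finds the best candidate exactly, which a gradient-based search on a flat, stepwise function cannot do.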
