Negative Value of Information Gain

Is it possible that, for some feature x, the split causes a child node to have a higher entropy than the parent node, which would mean a negative value for the information gain?

Hi @Ammar_Jawed,

I had never thought of this, and after about 15 minutes of thinking I couldn't come up with any case that gives a negative information gain (IG). At least, we could not have a split such that both children's entropies are larger than the parent's.

I know this is incomplete, so you would probably need to search for a more formal or definitive proof, or else find one example of negative information gain (I don't believe one exists, but feel free to show me I am wrong).

Cheers,
Raymond

Let me paste my sketch here before I erase it:

To show the following

At least, we could not have a split such that both children's entropies are larger than the parent's

[Image: the concave binary entropy curve H(p), with the parent's positive-class fraction lying between the two children's fractions]

From the graph: the parent's positive-class fraction is a weighted average of the two children's fractions, so it lies between them. That means at least one child's fraction is at least as far from 0.5 as the parent's, and since entropy decreases as the fraction moves away from 0.5, that child's entropy cannot exceed its parent's. So one of the children must have less entropy than the parent (unless both children have the same fraction, and hence the same entropy, as the parent).
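For the stronger statement, that information gain itself is never negative, the standard argument uses the concavity of entropy (Jensen's inequality). A sketch, writing p for the parent's positive-class fraction, p_1, p_2 for the children's fractions, and w_1, w_2 for the fractions of examples sent to each child:

```latex
% Counting positives across the split gives p = w_1 p_1 + w_2 p_2,
% with w_1 + w_2 = 1. Binary entropy H is concave, so by Jensen:
\begin{align}
w_1 H(p_1) + w_2 H(p_2) &\le H(w_1 p_1 + w_2 p_2) = H(p), \\
\text{IG} \;=\; H(p) - \bigl[\, w_1 H(p_1) + w_2 H(p_2) \,\bigr] &\ge 0 .
\end{align}
```

The key point is that IG subtracts the *weighted average* of the children's entropies, not either child's entropy alone, so a single "worse" child can never drag the gain below zero.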
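A quick numeric check (a sketch in Python/NumPy, with made-up example counts) illustrates the same point: one child can have higher entropy than the parent, yet the information gain, which weights each child's entropy by its share of the examples, stays non-negative.

```python
import numpy as np

def entropy(p):
    """Binary entropy (in bits) of a node whose positive-class fraction is p."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * np.log2(p) - (1 - p) * np.log2(1 - p)

# Made-up split: parent has 7 positives, 3 negatives (p = 0.7).
# Left child:  2 pos, 2 neg (p = 0.5)  -> entropy 1.0, HIGHER than the parent's.
# Right child: 5 pos, 1 neg (p = 5/6)  -> much purer.
h_parent = entropy(0.7)                        # ~0.881
h_left, h_right = entropy(0.5), entropy(5 / 6)

# IG uses the weighted average of the children's entropies (weights 4/10, 6/10).
ig = h_parent - (0.4 * h_left + 0.6 * h_right)
print(h_left > h_parent)                       # True: one child is "worse"
print(ig)                                      # ~0.091, still positive

# Brute-force check: IG is never negative for any split of any labelling.
rng = np.random.default_rng(0)
for _ in range(10_000):
    labels = rng.integers(0, 2, size=20)           # random binary labels
    mask = rng.integers(0, 2, size=20).astype(bool)
    if mask.all() or not mask.any():
        continue                                   # both children must be non-empty
    h_p = entropy(labels.mean())
    h_avg = (mask.sum() * entropy(labels[mask].mean())
             + (~mask).sum() * entropy(labels[~mask].mean())) / labels.size
    assert h_p - h_avg >= -1e-12                   # IG >= 0 (up to rounding)
```

If a negative IG existed, the brute-force loop would trip the assertion; it never does, which matches the Jensen argument above.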