Hi there,
I would like to ask the following.
When computing the reduction in entropy to choose how to split a node, I understand that for the root node the formula is:
1 - Weighted Average Entropy
We subtract from 1 because the entropy at the root node is 1 to begin with.
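To make this concrete, here is a minimal Python sketch of the root-node computation as I understand it. It assumes binary labels (1 or 0) and a balanced root, which is the case where the entropy comes out to 1. The function names and the example labels are made up purely for illustration.

```python
import math


def entropy(p):
    """Binary entropy (in bits) of a node whose fraction of positives is p."""
    if p == 0 or p == 1:
        return 0.0  # a pure node has zero entropy
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def weighted_average_entropy(left, right):
    """Entropy of the two (non-empty) branches, weighted by branch size."""
    n = len(left) + len(right)
    w_left = len(left) / n
    w_right = len(right) / n
    p_left = sum(left) / len(left)
    p_right = sum(right) / len(right)
    return w_left * entropy(p_left) + w_right * entropy(p_right)


# Hypothetical root node: 5 positives and 5 negatives, so p = 0.5.
root = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
root_entropy = entropy(sum(root) / len(root))  # equals 1 for a 50/50 node

# Hypothetical split of the root into two branches.
left = [1, 1, 1, 1, 0]
right = [1, 0, 0, 0, 0]

# Root-node formula from above: 1 - weighted average entropy.
reduction = 1 - weighted_average_entropy(left, right)
print(root_entropy, reduction)
```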
My question is: what would the formula be when splitting the other decision nodes of the tree?
I understand how to compute the weighted average entropy, but I am not sure whether it should again be subtracted from 1, as we did for the root node, and if so, why. In other words, is the entropy at the parent node always 1?
Thanks.