Choosing a split via entropy calculations

Please explain why we need to subtract the weighted average from H(0.5). Also, where does H(0.5) come from?

Advanced learning algorithms;
Module 4;
Video Name: Choosing a split: Information Gain

Hello @gigaGPT

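H(0.5) is the entropy of the parent node, which measures its impurity (higher means less pure). We need the subtraction because we want to know how much the entropy drops going from the parent node to the two child nodes. That reduction is the information gain.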
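To make the subtraction concrete, here is a minimal sketch in Python. The function names `entropy` and `information_gain` are my own, and the child counts (4 cats out of 5 on the left, 1 cat out of 5 on the right) are an assumed split chosen for illustration; only the 5 cats and 5 dogs in the parent come from this thread.

```python
import math

def entropy(p1):
    """Binary entropy H(p1) in bits; H(0) and H(1) are taken as 0."""
    if p1 in (0.0, 1.0):
        return 0.0
    return -p1 * math.log2(p1) - (1 - p1) * math.log2(1 - p1)

def information_gain(parent, left, right):
    """Each argument is a (cats, total) pair describing one node."""
    n = parent[1]
    h_parent = entropy(parent[0] / parent[1])
    # Weight each child's entropy by the share of examples it receives.
    w_left, w_right = left[1] / n, right[1] / n
    h_children = (w_left * entropy(left[0] / left[1])
                  + w_right * entropy(right[0] / right[1]))
    # Entropy before the split minus weighted entropy after the split.
    return h_parent - h_children

parent = (5, 10)  # 5 cats out of 10 animals, so p1 = 0.5
left = (4, 5)     # assumed split, for illustration only
right = (1, 5)    # assumed split, for illustration only

print(round(information_gain(parent, left, right), 2))  # 0.28
```

If the split did not help at all (both children still half cats, half dogs), the weighted average would also be H(0.5) and the result would be 0, which is why the parent's entropy is the baseline to subtract from.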

It comes from what the parent node is composed of: everything you see in the two child nodes combined. The parent node has 5 cats and 5 dogs, so its fraction of cats is 5/10 = 0.5, and its entropy is therefore H(0.5).

For how 5 cats and 5 dogs lead to H(0.5), you will need to watch the video “Measuring purity” again.
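As a quick reminder, here is the arithmetic written out. This assumes the usual binary entropy definition with base-2 logarithms, where p_1 stands for the fraction of cats in the node.

```latex
H(p_1) = -p_1 \log_2(p_1) - (1 - p_1) \log_2(1 - p_1)

p_1 = \frac{5}{10} = 0.5

H(0.5) = -0.5 \log_2(0.5) - 0.5 \log_2(0.5) = 0.5 + 0.5 = 1
```

So a node with an even 5/5 mix has the highest possible entropy, 1.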