Finding variance in decision tree leaf nodes

Alex_Schlieck · August 7, 2022, 11:11pm

Maybe I missed it, but I can’t find where the videos define the formula for finding the variance of values within the leaf nodes. (This is for the optional week 4 video on regression trees.) In the video Andrew introduces the concept of variance, says not to worry about the equation for that slide, then fills in all the values. But he never goes back and gives the formula later.

SamReiswig · August 7, 2022, 11:55pm

Hi!

The variance for the node would be \dfrac{\sum_{i=1}^N(x_i - \mu)^2}{N}
Where \mu is the mean of the values of the node, x_i is an individual value and N is the number of values for that node. When splitting based on variance the idea is to make splits so that the variance of child nodes gets closer and closer to 0.
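As a rough illustration of the formula above, here is a minimal Python sketch. The function name `node_variance` and the example values are hypothetical, not taken from the course materials; the function simply divides the summed squared deviations by N.

```python
def node_variance(values):
    """Population variance of the values in one node (divisor N)."""
    n = len(values)
    mu = sum(values) / n  # mean of the node's values
    return sum((x - mu) ** 2 for x in values) / n


# Hypothetical node values, for illustration only.
print(node_variance([1.0, 2.0, 3.0, 4.0]))  # 1.25
print(node_variance([5.0, 5.0]))            # 0.0, all values identical
```

A node whose values are all the same has variance 0, which is the direction the splits are trying to move toward.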

Ginny_Khue_Dang · August 5, 2023, 1:39pm

Thanks! This is exactly what I’m looking for.

Christina_Fan · April 25, 2024, 11:54pm

Hi Sam, I attempted to calculate the first variance using the formula above. Why did I get 1.17 instead of 1.47?
Please note I used the 5 samples (7.2, 9.2, 8.4, 7.6, 10.2) with N=5 and \mu=8.52 in my calculation. Basically, I calculated the square of each number minus \mu, summed the results, then divided by 5.

I also tried the 2nd set of data and got a variance of 17.49 instead of 21.87. Could you shed some light on where I went wrong?

Thank you
Christina

rmwkwok · April 26, 2024, 1:06am

Hello @Christina_Fan,

We probably have learned to compute variance in the left way, but sometimes people choose to use the other way.

If the formula is sufficient for now, then it is good.

If you wonder why we divide by n-1 and want to get into the thinking mode of a statistician, then as a starting point, you might read the first three paragraphs in this section of Wikipedia or google “population variance vs. sample variance” for some materials that suit your learning style. However, you don’t need to get to the bottom of this to complete this specialization or to use decision trees in your work.

Cheers,
Raymond

TMosh · April 26, 2024, 1:24am

It’s a statistics thing.

If you have a sample, the divisor is (N-1).
If you have the entire population, the divisor is (N).
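To make the two divisors concrete, here is a small sketch using Python's standard `statistics` module on the five values quoted earlier in the thread. It suggests that the 1.17 vs. 1.47 gap comes from the choice of divisor; that the lecture slide used N-1 is inferred from the numbers in this thread, not confirmed from the course materials.

```python
import statistics

# The five values quoted in the question above.
samples = [7.2, 9.2, 8.4, 7.6, 10.2]

population_var = statistics.pvariance(samples)  # divisor N
sample_var = statistics.variance(samples)       # divisor N - 1

print(round(population_var, 4))  # 1.1776
print(round(sample_var, 4))      # 1.472
```

The two results differ by a factor of N/(N-1), which here is 5/4. The second data set shows the same ratio: 17.49 × 5/4 ≈ 21.86.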
