Boosting algorithms

In boosting algorithms, I learned that we perform sampling with replacement. But why not also use feature randomization, like in random forests?

Also, how do we know that XGBoost won't end up with correlated trees, as in bagged trees (for example, the same root node in most of the trees)?

Hi @bhavanamalla

Great questions! XGBoost does subsample the training data for each tree (its subsample parameter draws rows without replacement, unlike bagging's sampling with replacement), and by default it does not randomize the features at each split the way random forests do.

I'll try to break it down for you:

  • Feature randomization in random forests helps decorrelate the trees so that their errors differ and averaging reduces variance. In boosting, each tree is fit to correct the errors left by the earlier trees, so the sequential process itself pushes the trees to do different work; per-split feature randomization is not essential to the method.

  • XGBoost relies on regularization instead: shrinkage (the learning rate) scales down each tree's contribution, and complexity penalties (maximum depth, minimum child weight, gamma, and L1/L2 penalties on leaf weights) keep individual trees small and simple, so feature randomization matters less for controlling overfitting (a code sketch follows this list).

  • Each boosting tree is fit to the current residuals, so it helps to let it search over all features for the best correcting split; XGBoost keeps that exhaustive search fast through its sorted-block and histogram-based split-finding algorithms rather than by restricting the candidate features.
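
Here is a minimal sketch of what that regularization looks like in practice, using the xgboost Python package's scikit-learn wrapper. The parameter values are purely illustrative, not tuned recommendations:

```python
# Minimal sketch: XGBoost's built-in regularization knobs.
# Assumes the xgboost and scikit-learn packages are installed;
# the values below are illustrative, not recommendations.
from sklearn.datasets import make_classification
from xgboost import XGBClassifier

X, y = make_classification(n_samples=1000, n_features=20, random_state=42)

model = XGBClassifier(
    n_estimators=200,     # number of boosted trees
    learning_rate=0.1,    # shrinkage: scales down each tree's contribution
    max_depth=4,          # caps tree depth / complexity
    min_child_weight=1,   # minimum instance weight required in a leaf
    gamma=0.1,            # minimum loss reduction required to make a split
    reg_lambda=1.0,       # L2 penalty on leaf weights
    reg_alpha=0.0,        # L1 penalty on leaf weights
    random_state=42,
)
model.fit(X, y)
```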

As for tree correlation: it's true that, with no randomness at all, XGBoost could end up with similar tree structures (for example, the same root split). But it offers several mechanisms that help decorrelate the trees:

  • Column subsampling (the colsample_bytree, colsample_bylevel, and colsample_bynode parameters): each tree, level, or split can be restricted to a random subset of the columns; see the sketch after this list.

  • Row subsampling (the subsample parameter): each tree trains on a random fraction of the rows, so different trees see different data.

  • Most importantly, each tree is fit to a different target, namely the residuals left by the trees before it, so the trees naturally tend to differ; shrinkage and regularization further limit how much any one tree's structure can dominate.
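
And here is a minimal sketch of turning on those two sources of randomness, again with the scikit-learn wrapper; the 0.8 fractions are arbitrary values chosen just for illustration:

```python
# Minimal sketch: row and column subsampling in XGBoost.
# Assumes xgboost and scikit-learn are installed; the fractions are illustrative.
from sklearn.datasets import make_classification
from xgboost import XGBClassifier

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

model = XGBClassifier(
    n_estimators=200,
    learning_rate=0.1,
    subsample=0.8,         # each tree trains on a random 80% of the rows
    colsample_bytree=0.8,  # each tree sees a random 80% of the columns
    colsample_bynode=0.8,  # each split considers a random 80% of those columns
    random_state=0,
)
model.fit(X, y)
```

With these left at their defaults (subsample=1, colsample_*=1), XGBoost uses every row and every column for every tree, so they are opt-in sources of randomness rather than something the algorithm requires.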

So in summary, XGBoost gives up some of the randomness that random forests rely on in favor of focused, sequential error correction, but it still offers row and column subsampling plus regularization to keep the trees from becoming too correlated. That balance is what lets it boost effectively!

I hope this helps!


Thanks for your clear explanation!