Overfitting in boosted trees

In the case of boosted trees,
if we create a new tree to address the misclassifications made by the previous trees, won’t we run into the problem of overfitting?

Hey @Thala,

Indeed, if you keep creating new decision trees that address the samples misclassified by the previously trained trees, you will eventually overfit. That’s where the large number of hyper-parameters comes in: you can tune them to address over-fitting.

For instance, take a look at the documentation of the XGBRegressor offered by the XGBoost library. You will find more than 20 hyper-parameters. Some are already set to default values that work well in most scenarios (arrived at after hundreds of experiments), and the rest are left for you to tune according to the task at hand. I hope this helps.
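
As a minimal sketch (the parameter values below are illustrative placeholders, not recommendations), you can list XGBRegressor's hyper-parameters and override a few of the ones commonly tuned against overfitting:

```python
# Sketch: inspect XGBRegressor's hyper-parameters, then override a few of the
# ones commonly tuned against overfitting. Values are illustrative only.
from xgboost import XGBRegressor

# Lists the hyper-parameter names (unset ones fall back to library defaults).
print(XGBRegressor().get_params())

model = XGBRegressor(
    max_depth=4,        # limit how deep each tree can grow
    learning_rate=0.1,  # shrink each tree's contribution
    reg_lambda=1.0,     # L2 regularization on leaf weights
)
```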

Regards,
Elemento

Ok
I think the boosted trees concept discussed in the course was only at an introductory level, so hyper-parameters and related topics were not covered.
It would be very helpful if you could share some other materials/videos that discuss boosted trees in greater depth 😅

Hello @Thala,

Generally speaking, to deal with overfitting we can either (1) limit the growth of each tree, or (2) limit the number of trees. We can certainly do both.

To achieve (1), for example, we can (a) limit the data visible to each tree (this idea is covered in the videos), (b) require a minimum gain before allowing a split (we calculated gain in the C2 W4 assignment), or (c) hard-code a maximum number of splits and/or a maximum tree depth.

To achieve (2), for example, we can again hard-code the maximum number of trees, or use early stopping to stop adding trees once a certain condition is met.
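
To make both ideas concrete, here is a minimal sketch using XGBoost's scikit-learn wrapper. The values and the `X_train`/`y_train`/`X_val`/`y_val` variables are placeholders, and depending on your XGBoost version `early_stopping_rounds` may need to be passed to `fit()` rather than the constructor:

```python
# Sketch: (1) limit the growth of each tree, (2) limit the number of trees.
# Parameter values are placeholders, not recommendations.
from xgboost import XGBRegressor

model = XGBRegressor(
    # (1) limit the growth of each tree
    max_depth=3,               # hard cap on tree depth
    gamma=0.1,                 # minimum gain required before a split is made
    subsample=0.8,             # each tree only sees a subset of the rows
    # (2) limit the number of trees
    n_estimators=500,          # hard cap on the number of trees
    early_stopping_rounds=10,  # stop adding trees when validation stops improving
)
model.fit(X_train, y_train, eval_set=[(X_val, y_val)])
```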

I hope the general idea above isn’t difficult to follow. For details such as which hyper-parameters to tune, you can start reading from here, which will give you the names of some hyperparameters, so that you can then read about their definitions here. I also reread that page sometimes to refresh my memory about what options I have.

Lastly, if you want to know more about a specific hyperparameter, try googling its name or “xgboost {the name}”. If you want to read what others have shared on how to tune hyperparameters, you can find plenty of articles online.

Cheers,
Raymond

Similar to random forests, we can use a subset of features when splitting at a node, right?

Yes, we can do that.
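
For reference, a minimal sketch of how this could be set in XGBoost: `colsample_bynode` samples a random fraction of features at each split, much like a random forest does (the 0.5 below is just an illustrative value).

```python
# Sketch: sample a random subset of features at each node split,
# similar to what a random forest does; 0.5 is an illustrative value.
from xgboost import XGBClassifier

model = XGBClassifier(colsample_bynode=0.5)
```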

Raymond
