Hi everyone,
While building a decision tree, we might need to stop splitting if the information gain is no longer significant, or if we reach some pre-defined depth limit. This clearly saves time by not processing attributes that are not very important.
However, I think it also helps prevent overfitting. The intuition is that if we keep splitting until the very end (that is, until a node has no samples left or all of its samples belong to the same class), the tree may become too specific to the training dataset, much like overfitting in linear regression; in other words, the tree does not generalize well to real-world data. Therefore, it is not always good to split all the way down to pure leaves.
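For example, here is a minimal sketch of what I mean, using scikit-learn on a synthetic dataset purely for illustration (the parameter values and the dataset are just my own assumptions, not from the course):

```python
# Minimal sketch (illustration only): comparing a fully grown tree with a
# depth-limited tree on a synthetic dataset to see the overfitting effect.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1000, n_features=20, n_informative=5,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Fully grown tree: splits until leaves are pure or run out of samples.
full_tree = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)

# Early-stopped tree: a depth limit plus a minimum impurity decrease,
# roughly "stop when the information gain is no longer significant".
pruned_tree = DecisionTreeClassifier(max_depth=4, min_impurity_decrease=0.01,
                                     random_state=0).fit(X_train, y_train)

for name, tree in [("full", full_tree), ("pruned", pruned_tree)]:
    print(name,
          "train acc:", round(tree.score(X_train, y_train), 3),
          "test acc:", round(tree.score(X_test, y_test), 3))
```

Typically the fully grown tree reaches (near) perfect training accuracy but a lower test accuracy than the early-stopped one, which is what I mean by the tree being too specific to the training set.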
Do I understand it correctly?
Thank you.
Yes, I think you understand it correctly.
This thread might be relevant for you, since it also touches upon decision trees:
https://community.deeplearning.ai/t/can-decision-tree-algo-used-for-regression/268415/8
Best regards
Christian
@francesco4203: I believe you described the concept well in general, but I am not sure what you mean concerning linear regression:
In general, a linear regression model is a rather simple model; e.g., it can work with just one weight and one bias if you only use one feature. Therefore, it's usually not associated with overfitting.
Of course you have a point if someone were to add lots(!!) of features to a linear regression model, going way beyond the capacity of the data (see the small sketch below). But just to understand your statement correctly: may I ask what you associate with overfitting here, or what you are referring to?
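Just to make that edge case concrete, here is a small sketch with synthetic data (the dataset and the polynomial degree are arbitrary choices of mine for illustration, nothing from the course):

```python
# Small sketch (illustration only): plain linear regression vs. the same model
# fed with many polynomial features of a tiny noisy dataset.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)
X = np.sort(rng.uniform(0, 1, size=(20, 1)), axis=0)
y = np.sin(2 * np.pi * X).ravel() + rng.normal(0, 0.2, size=20)
X_test = np.linspace(0, 1, 200).reshape(-1, 1)
y_test = np.sin(2 * np.pi * X_test).ravel()

# One feature: a simple straight line (if anything, it underfits).
simple = LinearRegression().fit(X, y)

# Degree-15 polynomial features: far more capacity than 20 points support.
wiggly = make_pipeline(PolynomialFeatures(degree=15), LinearRegression()).fit(X, y)

for name, model in [("1 feature", simple), ("degree-15", wiggly)]:
    print(name,
          "train R^2:", round(model.score(X, y), 3),
          "test R^2:", round(model.score(X_test, y_test), 3))
```

Usually the many-features model fits the training points much more closely than it fits new points, while with only one or a few features plain linear regression rarely overfits.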
Thanks!
Best regards
Christian
@francesco4203, you have explained very well how a decision tree can overfit to the training data.
Cheers!
Raymond