Can a decision tree algorithm be used for regression?

I would say it can be used, but because the target value is continuous there could be infinitely many leaf nodes, so it is more often used for classification problems. What do you think?

Hi @tbhaxor

you can use decision trees for regression: sklearn.tree.DecisionTreeRegressor — scikit-learn 1.2.1 documentation

When predicting a continuous variable with a decision tree regressor, you typically see a discretised (step-like) output. (Note regarding your infinity question: you do not want to go for an arbitrary or infinite depth, also because of the risk of overfitting!)
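To make the discretised output concrete, here is a minimal sketch. The synthetic sine data, the `max_depth=3` setting and all variable names are my own assumptions for illustration, not taken from the course or the linked documentation.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Assumed toy data: a noisy sine curve on [0, 6]
rng = np.random.RandomState(0)
X = np.sort(rng.uniform(0, 6, size=200)).reshape(-1, 1)
y = np.sin(X).ravel() + rng.normal(scale=0.1, size=200)

# A shallow tree: depth 3 allows at most 2**3 = 8 leaves
tree = DecisionTreeRegressor(max_depth=3, random_state=0)
tree.fit(X, y)

# Predict on a dense grid; every leaf returns one constant value,
# so the prediction is a step function
X_grid = np.linspace(0, 6, 1000).reshape(-1, 1)
y_pred = tree.predict(X_grid)

n_leaves = tree.get_n_leaves()
n_distinct = len(np.unique(y_pred))
print(f"leaves: {n_leaves}, distinct predicted values: {n_distinct}")
```

However fine the input grid is, the tree can only return as many distinct values as it has leaves, which is why the plotted prediction looks like a staircase.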

Here is an AdaBoost example which is quite nice to start and play around with:



Best regards
Christian

An infinite number of leaf nodes isn’t really a practical design.

Also, one question: does this also apply to infinite breadth, i.e. infinitely many splits per node? I only see binary splits in the course, but I am curious.

Does “infinite” anything sound like it could be implemented?

In this context, it means a very (arbitrarily) large number.

There is always a tradeoff between what’s possible and what is practical.
There is often more than one method to approach any specific problem.
These decisions are guided by experience.

Hi @tbhaxor,

most algorithms to create trees rely on binary splits. I noticed this question is also asked here: Are decision trees always binary trees?
Feel free to follow it if this is of interest to you!
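If you want to check the binary-split behaviour yourself, here is a small sketch for scikit-learn's implementation specifically. The toy data and variable names are assumptions for illustration; `tree_.children_left` and `tree_.children_right` are the arrays scikit-learn uses to store each node's children, with -1 marking a leaf.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Assumed toy data for illustration
rng = np.random.RandomState(0)
X = rng.uniform(0, 6, size=(200, 1))
y = np.sin(X).ravel()

tree = DecisionTreeRegressor(max_depth=3, random_state=0).fit(X, y)

left = tree.tree_.children_left
right = tree.tree_.children_right

# A leaf has both child indices set to -1; an internal node has two children
is_leaf = (left == -1) & (right == -1)
is_internal = (left != -1) & (right != -1)

# Every node should fall into exactly one of the two groups
all_binary = bool(np.all(is_leaf | is_internal))
print("internal nodes:", int(is_internal.sum()), "leaves:", int(is_leaf.sum()))
print("all splits binary:", all_binary)
```

This only says something about scikit-learn's trees. Other algorithms can in principle use multiway splits, which is what the linked question is about.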

If you have too large a depth or too complex a model with too many parameters, you really increase the risk of overfitting, which you can see e.g. in your test set performance or in a residual analysis of the train and test sets. See also this thread for some info: Didn't understand: (Model selection based on degree of polynomial) Overly optimistic regards to generalization error - #3 by Christian_Simonis

Usually you can regulate the tree structure with the maximum depth; in our previous example it was max_depth=4: Can decision tree algo used for regression? - #2 by Christian_Simonis
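As a rough sketch of how the maximum depth shows up in train and test performance, you could compare a depth-limited tree with an unrestricted one. The noisy sine data, the 50/50 split and the variable names are assumptions for illustration; exact numbers will depend on the data.

```python
import numpy as np
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Assumed toy data: sine curve with fairly strong noise
rng = np.random.RandomState(0)
X = rng.uniform(0, 6, size=(400, 1))
y = np.sin(X).ravel() + rng.normal(scale=0.3, size=400)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, random_state=0
)

results = {}
for depth in (4, None):  # None lets the tree grow until it cannot split further
    model = DecisionTreeRegressor(max_depth=depth, random_state=0)
    model.fit(X_train, y_train)
    train_mse = mean_squared_error(y_train, model.predict(X_train))
    test_mse = mean_squared_error(y_test, model.predict(X_test))
    results[depth] = (train_mse, test_mse)
    print(f"max_depth={depth}: train MSE={train_mse:.4f}, test MSE={test_mse:.4f}")
```

The unrestricted tree memorises the training points, so its training error drops to essentially zero while its test error does not. That gap between train and test error is the overfitting signal to look for.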

In addition, the model (and how smoothly it can follow the labels and overcome the discretisation) can be steered with the number of boosting stages: n_estimators=300 allowed a finer discretisation than n_estimators=1 alone. See also: Decision Tree Regression with AdaBoost — scikit-learn 1.2.1 documentation
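Here is a minimal sketch of that effect, loosely modelled on the scikit-learn AdaBoost example but not copied from it. The data generation, the evaluation grid and the variable names are my assumptions. It counts how many distinct values each model can output.

```python
import numpy as np
from sklearn.ensemble import AdaBoostRegressor
from sklearn.tree import DecisionTreeRegressor

# Assumed toy data: sum of two sines plus a little noise
rng = np.random.RandomState(1)
X = np.linspace(0, 6, 100).reshape(-1, 1)
y = np.sin(X).ravel() + np.sin(6 * X).ravel() + rng.normal(0, 0.1, X.shape[0])

X_grid = np.linspace(0, 6, 2000).reshape(-1, 1)

distinct = {}
for n in (1, 300):
    # Depth-4 trees as the weak learner, boosted for n stages
    model = AdaBoostRegressor(
        DecisionTreeRegressor(max_depth=4), n_estimators=n, random_state=0
    )
    model.fit(X, y)
    distinct[n] = len(np.unique(model.predict(X_grid)))
    print(f"n_estimators={n}: {distinct[n]} distinct predicted values")
```

With a single depth-4 tree the output is limited to at most 16 levels. Combining many trees gives the ensemble more possible output values, so the steps become finer.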

So in summary it’s your architectural decision how complex you design your model. It’s a trade-off between having:

  • sufficient freedom (e.g. tree depth) to learn abstract patterns in the data and not underfit
  • but not having too many parameters (so that, besides actual patterns, noise is also memorised and factored in by the model, which we would call overfitting)

I would like to encourage you to play a little with the code example of the AdaBoost model. You can start by modifying the regularization of your model via the maximum depth. (Afterwards you can also think about pruning, i.e. cutting low-information branches; see also this article: Understanding Decision Trees. You can relate our decision making… | by Himanshu Birla | The Startup | Medium.)
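For the pruning part, one option in scikit-learn is minimal cost-complexity pruning via the `ccp_alpha` parameter. This is a sketch of that particular technique, not necessarily the approach described in the linked article. The toy data and the value `ccp_alpha=0.01` are arbitrary assumptions for illustration.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Assumed toy data: noisy sine curve
rng = np.random.RandomState(0)
X = rng.uniform(0, 6, size=(300, 1))
y = np.sin(X).ravel() + rng.normal(scale=0.2, size=300)

# Fully grown tree without any pruning
unpruned = DecisionTreeRegressor(random_state=0).fit(X, y)

# Same tree, but branches whose contribution is below the
# (arbitrarily chosen) complexity penalty are cut away
pruned = DecisionTreeRegressor(random_state=0, ccp_alpha=0.01).fit(X, y)

print("leaves without pruning:", unpruned.get_n_leaves())
print("leaves with ccp_alpha=0.01:", pruned.get_n_leaves())
```

In practice you would pick `ccp_alpha` by validation rather than by hand, for example by trying several values and comparing test set performance.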

Best regards
Christian

Nicely explained! Could you post this under the above-mentioned question? I need to mark it as solved there.

You can add a reply that points to this thread, and mark that as the solution.

However, there’s no real need to mark threads as solved.

Thanks!

Oh - didn’t realise this question was asked by you.
Anyway, glad I could help. Have a good one! 🙂

Best regards
Christian