Deciding when to use a Decision Tree Model

Hi! I’ve been reading some further material on tree-based models after the first 2 videos: When and Why Tree-Based Models (Often) Outperform Neural Networks | by Andre Ye | Towards Data Science

I’m curious to understand:
a) Is there a general rule of thumb as to when one should use a decision tree vs. another model, e.g. a neural network?
b) Would it be advised to implement both models and compare training sets?
c) If yes, is there any value in amalgamating the two models, e.g. a decision tree that calls on a neural network to classify?

It would be great to have some further examples of ML use cases and the types of models chosen and why.

Cheers,
Luke.

a) No.
b) Yes.
c) It depends on how you define “value”. Try it and see.

Hi @Luke_Rogers ,
I don’t think there is a rule of thumb that you can always apply.

It depends on the type and volume of data and also on the business problem.

  • When it comes to the type of data:
    You should use decision trees with tabular/structured data and neural networks with unstructured data (images, video, audio, etc.).
  • When it comes to volume: Neural networks are usually the better fit when you have a large amount of data.
  • When it comes to the type of business problem:
    If interpretability is more important than performance, you should use decision trees. If performance is more important than interpretability, neural networks might be better.
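
To make the interpretability point concrete, here is a minimal sketch, assuming scikit-learn is installed. The Iris dataset and `max_depth=2` are illustrative choices of mine, not something from the course. It shows that a small tree can be printed as plain if/else rules, which has no direct equivalent for a neural network's weights.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

# Small tabular dataset bundled with scikit-learn (chosen only for illustration).
iris = load_iris()

# A shallow tree stays readable; max_depth=2 is an arbitrary choice.
tree = DecisionTreeClassifier(max_depth=2, random_state=0)
tree.fit(iris.data, iris.target)

# export_text renders the learned splits as nested if/else rules.
rules = export_text(tree, feature_names=list(iris.feature_names))
print(rules)
```

Each printed line is a threshold on one feature, so you can trace exactly why a given sample ends up in a given class.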

Hope this helps :slight_smile:
Dalila

Hi @Luke_Rogers,

I agree that there is no general rule of thumb for that, especially since you can combine the two models in any number of ways. If there is a rule of thumb, it is to evaluate your work on a cross-validation (cv) set to figure out which model is best, or which way of combining the models works best.
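
As a rough sketch of that evaluation loop, assuming scikit-learn is installed: the code below scores a gradient boosted tree model, a small neural network, and one possible combination of the two on the same data. The synthetic dataset, the hyperparameters, the use of k-fold cross-validation instead of a single held-out cv set, and stacking as the way to combine the models are all my own assumptions for illustration. Stacking is only one of many ways to combine models.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic tabular data standing in for a real dataset (an assumption).
X, y = make_classification(
    n_samples=300, n_features=10, n_informative=5, random_state=0
)


def make_trees():
    # Gradient boosted decision trees with illustrative settings.
    return GradientBoostingClassifier(n_estimators=50, random_state=0)


def make_net():
    # Neural networks usually train better on scaled inputs,
    # so the scaler and the network are wrapped in one pipeline.
    return make_pipeline(
        StandardScaler(),
        MLPClassifier(hidden_layer_sizes=(16,), max_iter=500, random_state=0),
    )


candidates = {
    "gradient_boosted_trees": make_trees(),
    "neural_network": make_net(),
    # Stacking: a logistic regression learns how to weigh the
    # predictions of the two base models.
    "stacked": StackingClassifier(
        estimators=[("trees", make_trees()), ("net", make_net())],
        final_estimator=LogisticRegression(),
        cv=3,
    ),
}

# Score every candidate with the same 3-fold cross-validation (mean accuracy).
cv_scores = {}
for name, model in candidates.items():
    scores = cross_val_score(model, X, y, cv=3)
    cv_scores[name] = scores.mean()
    print(f"{name}: {cv_scores[name]:.3f}")
```

Whichever candidate scores best here says nothing about your own data. The point is only that all candidates, including the combined one, are judged on data they were not trained on.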

The above are my answers, and I am sorry if they are not what you were asking for. I have never seen a Standard Operating Procedure for this either. However, when a tabular dataset comes to me, I try gradient boosted decision trees first, whereas when an image dataset comes to me, I look for a pre-trained image network first. Then I can start my investigation and improvement cycles.

If you have done some work and have some findings to share and discuss, you are welcome to post them.

Cheers,
Raymond

Thanks All - I really appreciate the feedback.
@rmwkwok As you say, I’m essentially using a trial-and-error approach to find the best models. I find this especially useful while I’m in my infancy with ML, since my assumptions about what would work best aren’t always correct.

I noted that the course material did actually go on to explain when to use decision trees (primarily structured data). I was just curious (& impatient).

Cheers,
L.

You are welcome @Luke_Rogers :slight_smile: I appreciate your understanding too.

I think the best part about ML is when we actually build something; that is also the moment we can discuss things further and more meaningfully. I look forward to such discussions in the future :wink:

Raymond
