When pruning a decision tree, is it better to (a) iterate through every node in the tree, prune the node whose removal gives the best information gain, and repeat until no node's removal improves things, or (b) take a bottom-up approach, working up from the leaves and pruning a node whenever removing it gives >0 information gain?
A common practice is to use a bottom-up pruning approach (post-pruning, e.g. reduced-error pruning). Iteratively scanning all nodes for the single best prune on each pass is possible, but it re-traverses the whole tree every iteration, so it is usually more expensive and rarely used in practice.
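To make the bottom-up idea concrete, here is a minimal sketch of reduced-error post-pruning against a held-out validation set. The `Node` class and `prune_bottom_up` helper are hypothetical illustrations, not from any library, and the tree built at the bottom is a toy example whose second split is deliberately spurious:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Node:
    prediction: int                  # majority class at this node
    feature: Optional[int] = None    # None for a leaf
    threshold: float = 0.0
    left: Optional["Node"] = None
    right: Optional["Node"] = None

def is_leaf(node: Node) -> bool:
    return node.left is None and node.right is None

def predict(node: Node, x) -> int:
    while not is_leaf(node):
        node = node.left if x[node.feature] <= node.threshold else node.right
    return node.prediction

def accuracy(root: Node, X, y) -> float:
    return sum(predict(root, xi) == yi for xi, yi in zip(X, y)) / len(y)

def prune_bottom_up(root: Node, node: Node, X_val, y_val) -> None:
    """Prune the subtrees first, then try collapsing this node to a leaf;
    keep the collapse unless it lowers validation accuracy."""
    if is_leaf(node):
        return
    prune_bottom_up(root, node.left, X_val, y_val)
    prune_bottom_up(root, node.right, X_val, y_val)
    before = accuracy(root, X_val, y_val)
    saved_left, saved_right = node.left, node.right
    node.left = node.right = None            # tentatively collapse to a leaf
    if accuracy(root, X_val, y_val) < before:
        node.left, node.right = saved_left, saved_right   # revert: prune hurt

# Toy tree: the root split is real, the right child's split is spurious.
root = Node(prediction=0, feature=0, threshold=0.5,
            left=Node(prediction=0),
            right=Node(prediction=1, feature=1, threshold=0.5,
                       left=Node(prediction=1), right=Node(prediction=0)))
X_val = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)]
y_val = [0, 1, 1]                            # label depends only on feature 0
prune_bottom_up(root, root, X_val, y_val)
```

After running this, the spurious split under `root.right` is collapsed (it hurt validation accuracy) while the root split survives. Note the tie-breaking choice: a prune that leaves accuracy unchanged is kept, which biases toward smaller trees.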
Hope it helps! Feel free to ask if you need further assistance.
Aside from computational complexity, is there any modelling advantage of one method over the other?
In terms of modeling performance, both approaches pursue the same goal: removing splits that don't generalize. The main difference is that bottom-up pruning tends to be more stable, because each node is evaluated only after its subtrees have already been pruned, so the effect of collapsing it is clearer and less sensitive to local fluctuations in the training data.
The "best-node" iterative pruning can occasionally look better on the training set, but that doesn't necessarily translate to better generalization: it is greedier and can overfit to small differences between candidate prunes.