I would like to improve the decision tree’s generalization performance.
Based on the graph, what max depth should I choose to avoid both overfitting and underfitting?
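
For context, my understanding is that the usual rule is to pick the depth where the validation score peaks, before it starts dropping while the training score keeps rising. Here is a minimal sketch of that rule. The score lists are made-up placeholders, not the values from my graph, and `pick_max_depth` is a hypothetical helper name.

```python
# Hypothetical accuracy per max_depth; placeholders, NOT read from the graph.
depths = [1, 2, 3, 4, 5, 6, 7, 8]
train_scores = [0.70, 0.78, 0.84, 0.89, 0.93, 0.96, 0.98, 1.00]
val_scores = [0.68, 0.75, 0.80, 0.83, 0.82, 0.80, 0.78, 0.77]


def pick_max_depth(depths, val_scores):
    """Return the shallowest depth that reaches the best validation score."""
    best = max(val_scores)
    # list.index() returns the first match, so ties go to the simpler tree.
    return depths[val_scores.index(best)]


best_depth = pick_max_depth(depths, val_scores)
i = depths.index(best_depth)
# A small train/validation gap at the chosen depth suggests limited overfitting.
gap = train_scores[i] - val_scores[i]
print(f"chosen max_depth={best_depth}, train-validation gap={gap:.2f}")
```

Depths below the chosen one would be underfitting (both scores still low), and depths above it would be overfitting (training score rising while validation score falls).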