Fixing Decision Tree

I would like to improve the decision tree’s generalization performance.

Based on the graph, what max depth should I choose to avoid both overfitting and underfitting?
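
To make the question concrete, here is a minimal sketch of one common way to read such a graph: take the validation score at each depth and pick the shallowest depth whose score is within a small tolerance of the best one. The helper name `pick_max_depth`, the `tol` parameter, and the scores below are all hypothetical and are not taken from the graph; this is one reasonable heuristic, not the only way to choose.

```python
def pick_max_depth(depths, val_scores, tol=0.01):
    """Return the shallowest depth whose validation score is within `tol` of the best.

    Hypothetical helper for illustration; `tol` is an assumed tolerance.
    """
    if not depths or len(depths) != len(val_scores):
        raise ValueError("depths and val_scores must be non-empty and the same length")
    best = max(val_scores)
    # Depths scoring nearly as well as the best are all acceptable candidates.
    candidates = [d for d, s in zip(depths, val_scores) if s >= best - tol]
    # Prefer the simplest (shallowest) tree among them to limit overfitting.
    return min(candidates)


# Hypothetical validation scores for illustration only, NOT read from the graph.
depths = [1, 2, 3, 4, 5, 6, 7, 8]
val_scores = [0.69, 0.76, 0.80, 0.83, 0.825, 0.80, 0.78, 0.77]

chosen = pick_max_depth(depths, val_scores, tol=0.01)
print(chosen)
```

With these made-up numbers, the validation score rises up to depth 4 and then declines, so shallower trees would underfit and deeper trees would overfit.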