Why does XGBoost not choose the state at its best round when terminating early?

In week 4’s optional lab titled “Tree Ensemble”, it was mentioned that

  • The model is returned at its last state when training terminated, not its state during the best round. For example, if the model stops at round 26, but the best round was 16, the model’s training state at round 26 is returned, not round 16.
  • Note that this is different from returning the model’s “best” state (from when the evaluation metric was the lowest).

It seems to me that here we are choosing a state where the loss is higher, and hence performance is lower, instead of the state at round 16, which has a lower loss. Is my understanding correct? And what's the rationale for implementing it this way?

Thank you!


Link to the lab mentioned above. Somehow I couldn't include it when posting, so I'm omitting the Coursera domain and leaving only the path.

/learn/advanced-learning-algorithms/ungradedLab/kkC4N/optional-lab-tree-ensembles/lab?path=%2Fnotebooks%2FC2_W4_Lab_02_Tree_Ensemble.ipynb

Hello @liangweitan

The model's final state is not necessarily the state we have to use. :wink:

If the model stops at round 26, it contains 26 trees, and when we make predictions we can choose to use only the first 16 trees and ignore the last 10. In this way we are choosing to use the best state.

Check this out for how to do it.

Cheers,

Raymond

@rmwkwok, that URL doesn’t work (“Page doesn’t exist or is private”).


Oh! Thank you, Tom! I have updated it.