Predictions using XGBoost

My understanding is that when we’re using XGBoost (in this case regression trees), every time we have new data (X_new) on which we need to make predictions, we first make those predictions. Should we then go through the iterative loop of ML development? What I mean by that is taking those new examples, adding them to the data set, splitting that data set into training and validation sets (X_train, X_valid, y_train, y_valid), and training the model again for the next round of predictions.
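
Roughly, what I have in mind is something like the sketch below (the function name is just for illustration, and it assumes the true targets y_new for the new samples eventually become available):

```python
import numpy as np
from sklearn.model_selection import train_test_split
from xgboost import XGBRegressor

def retrain_with_new_samples(X_old, y_old, X_new, y_new):
    # Add the new, now-labeled samples to the existing data set
    X_all = np.vstack([X_old, X_new])
    y_all = np.concatenate([y_old, y_new])

    # Re-split into training and validation sets
    X_train, X_valid, y_train, y_valid = train_test_split(
        X_all, y_all, test_size=0.2, random_state=42
    )

    # Train a fresh model for the next round of predictions
    model = XGBRegressor(n_estimators=500, learning_rate=0.05)
    model.fit(X_train, y_train, eval_set=[(X_valid, y_valid)], verbose=False)
    return model
```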

Is that approach correct?

Are you asking about new features or new samples?

You’re right, I mean new samples.

Alright. If the performance of your predictions on the new samples has not degraded, you may not need to retrain your model. But if the performance does degrade, then you may need to start with an error analysis and go through the ML development cycle afterwards. Does that make sense?
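
For example, something along these lines (hypothetical names, and the 10% tolerance is an arbitrary choice): compare the error on the new samples against the validation error you recorded when you trained the model, using the same metric in both places.

```python
def performance_degraded(model, X_new, y_new, metric, baseline_error,
                         tolerance=0.10):
    """Return True if the error on the new samples, computed with the same
    metric used at training time, is more than `tolerance` (relative) worse
    than the baseline error recorded back then."""
    new_error = metric(y_new, model.predict(X_new))
    return new_error > baseline_error * (1 + tolerance)
```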

Does degradation occur when the Mean Absolute Error is greater than the initial one?

When you train your model, you should have defined a metric that measures how good the model is, and you should use that same metric, rather than just any metric, to determine whether the model’s predictions have degraded.

If MAE is the metric you used when training, then yes. If not, then please switch to the metric you actually used. A good model does not mean that every possible metric anyone can think of is at its best value; your model can be good on metric 1 but not so good on metric 2. So we really need to stick with the metric we chose for our model in the first place.
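
As a toy illustration (the numbers are made up), two sets of predictions can look identical under MAE but quite different under MSE, so the comparison depends on which metric you committed to:

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error

y_true = np.array([10.0, 12.0, 11.0, 9.0])
pred_a = np.array([11.0, 13.0, 12.0, 10.0])  # off by 1 on every sample
pred_b = np.array([10.0, 12.0, 11.0, 13.0])  # perfect except one miss by 4

print(mean_absolute_error(y_true, pred_a))   # 1.0
print(mean_absolute_error(y_true, pred_b))   # 1.0
print(mean_squared_error(y_true, pred_a))    # 1.0
print(mean_squared_error(y_true, pred_b))    # 4.0
```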
