How to evaluate a deployed ML model


I have implemented an ML model (XGBoost) for binary classification. It has been in use for approximately one year, but its performance is no longer what I expected: in the first months it worked very well, but today not so much. I want to measure the performance so that I have evidence to support an initiative to improve the model or, in the worst case, replace it, but I don’t know the best way to measure it. Could you help me? Thanks.


P.S. The model is used to detect car insurance claims.

Hi Daniel,

Models start to deteriorate from the moment they are deployed to production. Real-world data keeps evolving, and a trained model soon loses its usefulness.

One of the first things to do when deploying a model to production is to check for ‘data drift’ in the inputs. Has the distribution of the new data (characteristics such as the mean and standard deviation) changed significantly from the data used during training? If yes, the model is bound to under-perform and we definitely need to re-train it.
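As a rough illustration of such a drift check, here is a minimal sketch for a single numeric feature using only the Python standard library. It compares the mean and standard deviation and also computes the Population Stability Index (PSI), which is my addition and not something mentioned above. The function name `drift_report` and the synthetic data are hypothetical; in practice you would run this per feature on your real training and production data.

```python
import math
import random
import statistics


def drift_report(train_values, live_values, n_bins=10):
    """Compare one numeric feature between training data and live data."""
    train_mean = statistics.mean(train_values)
    train_std = statistics.stdev(train_values)
    live_mean = statistics.mean(live_values)
    live_std = statistics.stdev(live_values)

    # Cut points from training quantiles, so each bin holds roughly
    # 1/n_bins of the training data (n_bins - 1 cut points in total).
    edges = statistics.quantiles(train_values, n=n_bins)

    def bin_fractions(values):
        counts = [0] * n_bins
        for v in values:
            idx = sum(v > e for e in edges)  # index of the bin v falls into
            counts[idx] += 1
        # A small floor avoids log(0) when a bin is empty.
        return [max(c / len(values), 1e-6) for c in counts]

    expected = bin_fractions(train_values)
    actual = bin_fractions(live_values)
    psi = sum((a - e) * math.log(a / e) for a, e in zip(actual, expected))

    return {
        "mean_shift_in_train_std": (live_mean - train_mean) / train_std,
        "std_ratio": live_std / train_std,
        "psi": psi,
    }


# Synthetic data for illustration only (an assumption, not real claims data).
random.seed(0)
train = [random.gauss(0.0, 1.0) for _ in range(5000)]
same = [random.gauss(0.0, 1.0) for _ in range(5000)]
shifted = [random.gauss(1.0, 1.5) for _ in range(5000)]

report_same = drift_report(train, same)
report_shifted = drift_report(train, shifted)
print(report_same)
print(report_shifted)
```

A commonly quoted rule of thumb treats a PSI below 0.1 as stable and above 0.25 as a significant shift, but those cut-offs are conventions, so it is worth calibrating them against your own data.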

Just like monitoring the inputs, we should also monitor the outputs of the model. For example, say 10% of the applicants in the training dataset had their insurance claim denied. Do the deployed model’s predictions follow similar statistics? Are 10% of claims still denied, or is it now 20% after the model has been in service for a year? We should monitor these metrics and flag significant changes.
Do note, though, that just because the model is denying more claims doesn’t necessarily mean that it is performing poorly. It could just mean that the current data is genuinely like that. Nevertheless, it’s something to look into.
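To make the output check concrete, here is a small sketch that compares the denial rate at training time with the rate in production using a two-proportion z-test. The test itself is my suggestion, the counts are made up to match the 10% vs. 20% example, and the function name `rate_shift` is hypothetical.

```python
import math


def rate_shift(train_positives, train_total, live_positives, live_total):
    """Return both rates and a z-score for the difference between them."""
    p_train = train_positives / train_total
    p_live = live_positives / live_total
    # Pooled rate under the assumption that both periods share one true rate.
    pooled = (train_positives + live_positives) / (train_total + live_total)
    se = math.sqrt(pooled * (1 - pooled) * (1 / train_total + 1 / live_total))
    z = (p_live - p_train) / se
    return p_train, p_live, z


# Hypothetical counts: 1000 of 10000 denied in training, 400 of 2000 in production.
p_train, p_live, z = rate_shift(1000, 10000, 400, 2000)
print(f"training rate={p_train:.2f}, live rate={p_live:.2f}, z={z:.1f}")

# An assumed alerting threshold; tune it to how noisy your monthly volumes are.
if abs(z) > 3:
    print("Prediction rate has shifted; worth investigating.")
```

A large z-score only says the rate changed by more than sampling noise would explain. Whether the model or the underlying population changed still has to be checked by hand.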

Also, you mentioned that the performance was excellent when the model was first deployed but has degraded after a year. So you already have some measure of effectiveness (loss of revenue, loss of customers to competitors, more complaints, etc.); see if it can be incorporated into the ML lifecycle. One way it could help is in identifying a new training set for the model. Double-check the data from the complaints and see whether they really were valid claims rejected by the model. If yes, they should be added to the new training set.


Wow, SomeshChatterjee, that is clear. Thank you very much.

Once I decide to add new information to the model, what is the best technique for doing so? At the moment I am only thinking of building a new model with the new information, but that is a slow process. I would appreciate any guidance on doing this more efficiently.


Hi Daniel,

I think what you are asking about is ‘incremental learning’, i.e. take an existing trained machine learning model and then add more training examples to it so that the existing model can adapt to the new data.

If that’s the case, XGBoost has a provision for that: check out the xgb_model parameter of the fit method. If you pass it an existing trained model, training continues from that model instead of starting from scratch. You can check some examples from here.

Several other estimators in sklearn have a ‘warm_start’ parameter that lets you specify whether you want to keep the existing fit and add to it (e.g. check out GradientBoostingClassifier). In deep learning, you simply take the existing model and train it again on the new data; it will continue to update the weights based on that data.
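Here is a small sketch of warm_start with GradientBoostingClassifier, again on synthetic data that is purely an assumption for illustration. Note that n_estimators has to be raised before the second fit, otherwise no new trees are added.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(0)

# Synthetic "old" and "new" batches with the same number of features.
X_old = rng.normal(size=(500, 4))
y_old = (X_old[:, 0] > 0).astype(int)
X_new = rng.normal(loc=0.3, size=(500, 4))
y_new = (X_new[:, 0] > 0.3).astype(int)

model = GradientBoostingClassifier(n_estimators=50, warm_start=True, random_state=0)
model.fit(X_old, y_old)

# Keep the 50 existing trees and fit 30 additional ones on the new batch.
model.set_params(n_estimators=80)
model.fit(X_new, y_new)

n_trees = model.estimators_.shape[0]
print(n_trees)
```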

Do note that there may be some limitations to using incremental learning with XGBoost, as mentioned here.

Let me know if this helped or if there’s something else.



Additionally, might there be some ideas from the area of fine-tuning a deep learning model that could be applied here? It perhaps depends on exactly what the OP means by new information.