Adding new feature to a pretrained model

mouhandalkadri · September 18, 2022, 6:23am

Hello
Say we have a model trained on structured data, and it is already deployed in a production environment. At some point we decide to add new features to our data that are only available for newly created records (for example, we start asking users for their date of birth). The question is: what is the best practice for handling this type of change?

Do we have to retrain the whole model from scratch, as we would with completely new data, or is there some way to just update our old model?
Also, what about our old data? How should we fill in the missing feature (the date of birth in the example) in a way that doesn't hurt the model's performance?

Thank you

balaji.ambresh · September 18, 2022, 7:33am

There is no way to directly change an existing model to make use of a new feature. You can combine outputs of multiple models to make a prediction. For instance, assume that you were creating a model to approve a bank loan:

Old model (trained without the new feature) outputs prediction P1
New model (trained with the new feature) outputs prediction P2
Combine P1 and P2 to produce the final outcome, say, P3. For the sake of simplicity, this could be the average of P1 and P2.
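As a rough sketch of the combining step, assuming both models output a probability for the same class: the function name `combine_predictions` and the weight `w_new` are made up for illustration. With the default weight it reduces to the plain average mentioned above.

```python
def combine_predictions(p1: float, p2: float, w_new: float = 0.5) -> float:
    """Blend the old model's prediction (p1) with the new model's (p2).

    w_new is a hypothetical knob: the weight given to the new model.
    w_new = 0.5 gives the simple average of P1 and P2.
    """
    if not 0.0 <= w_new <= 1.0:
        raise ValueError("w_new must be between 0 and 1")
    return (1.0 - w_new) * p1 + w_new * p2


# Example: old model says 0.70, new model says 0.90.
p3 = combine_predictions(0.70, 0.90)             # simple average, about 0.80
p3_weighted = combine_predictions(0.70, 0.90, w_new=0.25)  # leans on the old model
```

A lower `w_new` might make sense while the new model has seen only a small amount of data, but the right weight is something you would have to validate on held-out data.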

Before asserting that the new features are good ones, you should have built a model that outperforms the existing model in storage, compute, or predictive performance.

There are two more options if your API will only support new data going forward:

Discard all old data and use only new data to create a fresh model that replaces the existing one. Keep in mind that there may be far fewer new data points than existing ones.
Provide defaults for the new features on old data points (say, use the most frequently occurring date of birth value) and build a new model with all data.
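The second option could look something like the sketch below. It uses only the standard library, and the field names (`income`, `birth_year`) and the helper `fill_with_most_frequent` are invented for the example. The extra `_was_missing` flag is not part of the suggestion above; it is an optional addition so the model can tell real values from filled-in ones.

```python
from collections import Counter


def fill_with_most_frequent(rows, feature):
    """Return copies of rows with missing `feature` values replaced by the
    most frequent observed value, plus a flag marking imputed rows."""
    observed = [r[feature] for r in rows if r.get(feature) is not None]
    if not observed:
        raise ValueError(f"no observed values for {feature!r}")
    default = Counter(observed).most_common(1)[0][0]

    filled = []
    for r in rows:
        missing = r.get(feature) is None
        new_row = dict(r)  # copy so the original records stay untouched
        new_row[feature] = default if missing else r[feature]
        new_row[f"{feature}_was_missing"] = missing
        filled.append(new_row)
    return filled


# Old records have no birth year; new records do.
rows = [
    {"income": 40, "birth_year": None},
    {"income": 55, "birth_year": None},
    {"income": 60, "birth_year": 1990},
    {"income": 38, "birth_year": 1990},
    {"income": 72, "birth_year": 1985},
]
training_rows = fill_with_most_frequent(rows, "birth_year")
```

After this step every row has a `birth_year`, so old and new data can go into one training set.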

In the event that your API needs to support calls both with and without the new feature, you will need to keep both models and invoke the appropriate one based on which fields the incoming data contains.
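A minimal sketch of that routing, assuming each model can be called as a function on a request dictionary: `old_model`, `new_model`, and the `date_of_birth` key are stand-ins for illustration, not a real serving API.

```python
def route_prediction(record, old_model, new_model, new_feature="date_of_birth"):
    """Send the record to the new model only if it carries the new feature."""
    if record.get(new_feature) is not None:
        return new_model(record)
    return old_model(record)


# Stand-in models that return fixed scores, just to show the dispatch.
def old_model(record):
    return 0.6


def new_model(record):
    return 0.8


legacy_call = route_prediction({"income": 50}, old_model, new_model)
new_call = route_prediction(
    {"income": 50, "date_of_birth": "1990-04-12"}, old_model, new_model
)
```

Here `legacy_call` comes from the old model and `new_call` from the new one. A request that sends the field with an empty value is treated the same as one that omits it.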
