Question: week 1, steps of an ML project -2.30 min

optimizing_wieghts · May 16, 2021, 3:16pm

You mentioned that if data drift happens during the inference period, you would update the model. How would you update it?
Do you mean you would combine the new data that caused the drift with the data the model was already trained on, and redo the feature engineering? If so, the new data will not have the target column, right? How would you handle that? Would it be good ground-truth data?

Correct me if I am wrong.

rajgupt · May 16, 2021, 6:29pm

Hi @optimizing_wieghts,

I can relate to your question. Combining both datasets makes sense if the old data distribution is still relevant to the problem statement. For example, speech recognition may still be required for adult voices in addition to young voices. And yes, the new dataset has to be labeled before it is used to retrain the model. There could be smarter ways to label some of it, but those will be application-dependent.
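
A minimal sketch of that combine-and-retrain idea, assuming each example is a dict with a hypothetical `"label"` key that is `None` until someone has labeled it. The function name and the `keep_old` flag are illustrative only, not something from the course:

```python
def build_retraining_set(old_rows, new_rows, keep_old=True):
    """Merge labeled old and new rows; set unlabeled new rows aside.

    keep_old=True corresponds to the case where the old distribution
    is still relevant (e.g. adult voices are still in scope).
    """
    # Only new rows that already carry a label can be used for training.
    labeled_new = [r for r in new_rows if r.get("label") is not None]
    # Rows without a label still need labeling before they can be used.
    unlabeled_new = [r for r in new_rows if r.get("label") is None]

    combined = (list(old_rows) if keep_old else []) + labeled_new
    return combined, unlabeled_new


old = [{"x": 1.0, "label": 0}, {"x": 2.0, "label": 1}]
new = [{"x": 9.0, "label": 1}, {"x": 8.5, "label": None}]

train_rows, to_label = build_retraining_set(old, new)
print(len(train_rows), len(to_label))
```

The rows returned in `to_label` are the ones that would go to whatever labeling process fits the application before the next retraining round.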

optimizing_wieghts · May 17, 2021, 12:48pm

@rajgupt thanks very much for your quick turnaround. Would you mind giving me an example scenario with structured data? Say we are going to predict house prices. I have trained the model and deployed it to the production environment. Assume we run batch predictions at a weekly interval. Before retraining the model, I compare the distribution of the target column in the training data with the distribution of the predicted column for the new data. If I notice a significant difference between the two distributions, I prefer to retrain. But here is the question: whatever we predicted is not guaranteed to be accurate, so how can we use that data to retrain the model?

Actually, I may have confused you with my approach. Could you please let me know your approach to updating the model? What is the smarter way to label?
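
The weekly distribution check described above could be sketched roughly as follows. This assumes a two-sample Kolmogorov-Smirnov statistic as the comparison, which is one common choice and not necessarily what the course uses. The `threshold` value and the function names are hypothetical. Note that this only flags a shift: the predictions are not labels, so the new rows would still need real sale prices before retraining.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Largest gap between the two empirical CDFs (two-sample KS statistic)."""
    a = sorted(sample_a)
    b = sorted(sample_b)
    # The maximum gap is always attained at one of the observed values.
    return max(
        abs(bisect_right(a, x) / len(a) - bisect_right(b, x) / len(b))
        for x in a + b
    )


def drift_detected(train_targets, new_predictions, threshold=0.2):
    # threshold is an assumed cut-off; it should be tuned per application.
    return ks_statistic(train_targets, new_predictions) > threshold


# Toy house prices (in thousands), made up for illustration only.
train_prices = [200, 210, 220, 230, 240, 250]
this_week_preds = [300, 310, 320, 330, 340, 350]

print(drift_detected(train_prices, train_prices))
print(drift_detected(train_prices, this_week_preds))
```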
