How and why do training and cross-validation sets wear out over time?

Christian_Simonis · January 28, 2023, 7:55pm

In addition to the previous answer: According to CRISP-DM, we eventually want to deploy our model and operate it to solve our business problem.
As @Muhammad_John_Abbas correctly mentioned, the model's performance on test data (which it has never seen before) is your litmus test: it has to pass to give you sufficient evidence that your machine learning pipeline performs well enough on new data. If it does, you can go ahead with a deployment, e.g. in the cloud.

In reality you also have:

- distribution shifts in the data (think of traffic / remote work during the pandemic in 2020)
- IoT swarm intelligence applications, which are designed as self-learning systems that get better and more powerful over time
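To make the first point a bit more concrete, here is a minimal sketch of how a distribution shift in a single feature could be flagged. The function name `mean_shift_score`, the threshold, and all numbers are made up for illustration; a real monitoring setup would likely use a proper statistical test and look at more than one feature.

```python
from statistics import mean, stdev


def mean_shift_score(reference, recent):
    """Shift of the recent mean, in units of the reference standard deviation."""
    return abs(mean(recent) - mean(reference)) / stdev(reference)


# Hypothetical daily traffic counts: the window the model was trained on
# versus two later windows observed in operation.
reference = [100, 104, 98, 101, 97, 103, 99, 102]
similar = [101, 99, 102, 98, 100, 103]
shifted = [61, 58, 64, 60, 57, 62]  # e.g. after a sudden move to remote work

THRESHOLD = 3.0  # assumed alert level, would need tuning per application

for name, window in [("similar", similar), ("shifted", shifted)]:
    score = mean_shift_score(reference, window)
    status = "retraining may be needed" if score > THRESHOLD else "looks stable"
    print(f"{name}: score={score:.2f} -> {status}")
```

When the score for recent data exceeds the threshold, the data the model was trained and validated on no longer describes what it sees in operation.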

This means that in reality you have to deploy often, e.g. after each retraining in which you incorporated new knowledge into your model.

In this situation, too, you need to run your litmus test on test data that has never been seen before, prior to deployment and operations. (Reusing the test set from a previous litmus test would be a problem: its information has already made it into the earlier process in some way, so it is not actually "new".)
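As a rough sketch of that idea: each retraining cycle draws its test set only from records that no earlier cycle has touched. The helper `split_cycle`, the record IDs, and the 20 % test fraction are assumptions for illustration. Folding the old test set into the new training data is also just one possible choice here.

```python
import random


def split_cycle(new_ids, test_fraction=0.2, seed=0):
    """Split newly collected record IDs into a training part and a fresh test part."""
    ids = sorted(new_ids)
    random.Random(seed).shuffle(ids)
    n_test = max(1, int(len(ids) * test_fraction))
    return set(ids[n_test:]), set(ids[:n_test])


# Cycle 1: first training and first litmus test.
train_1, test_1 = split_cycle(range(0, 100))
seen = train_1 | test_1  # everything that has influenced the process so far

# Cycle 2: new data has arrived. Only records outside `seen` count as "new".
new_ids = set(range(100, 200)) - seen
train_new, test_2 = split_cycle(new_ids, seed=1)

# Assumption of this sketch: the old test set is no longer "new",
# so it is simply added to the training data of the next cycle.
train_2 = train_1 | test_1 | train_new

# The litmus test for cycle 2 uses only data that was never seen before.
assert test_2.isdisjoint(seen)
assert test_2.isdisjoint(train_2)
```

The important part is the bookkeeping in `seen`: once a record has been used for training or for an earlier litmus test, it cannot serve as unseen test data again.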

I hope this view helps, @mehmet_baki_deniz. Have a good one!

Best regards
Christian
