Hi, the lectures seem to focus on finding problems by monitoring model performance, which by and large always requires some form of human in the loop. I thought one alternative was to employ some kind of data/concept drift detection on incoming data, which could raise an alert on potential model failure. That approach can be largely automated. If this is feasible, what would be the recommended tools for different kinds of data (e.g. audio, images, structured data, etc.)?
Hi @jax79sg,
I suggest reading https://paulvanderlaken.com/2020/03/24/ml-model-performance-degradation-production-concept-drift/ and https://towardsdatascience.com/why-machine-learning-models-degrade-in-production-d0f2108e9214. I hope they help.
Best regards
Thanks @JoshE, the article is model-centric: it focuses on solving issues brought about by data drift, for example by assessing the model with human-annotated data. My question is about the idea of data-centric monitoring: detecting drift and alerting humans to double-check the outcome. This is useful in scenarios where very regular model-based checking (with human annotators) is not feasible in production.
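To make the idea concrete, here is a minimal sketch of what I mean by data-centric monitoring, assuming structured/tabular inputs: compare each incoming feature against a reference window (e.g. a sample of the training data) with a two-sample Kolmogorov-Smirnov test, and alert a human when the shift looks significant. The function names and threshold are my own illustration, not a recommendation of a specific tool; for images or audio the same idea is usually applied to embedding vectors rather than raw inputs.

```python
# Minimal sketch (illustrative only): per-feature drift check on tabular
# data using a two-sample Kolmogorov-Smirnov test against a reference window.
import numpy as np
from scipy.stats import ks_2samp

def detect_drift(reference, incoming, feature_names, p_threshold=0.01):
    """Return features whose incoming distribution differs significantly
    from the reference window (e.g. a sample of training-time data)."""
    drifted = []
    for i, name in enumerate(feature_names):
        stat, p_value = ks_2samp(reference[:, i], incoming[:, i])
        if p_value < p_threshold:
            drifted.append((name, stat, p_value))
    return drifted

# Usage: simulate a production batch whose features have shifted.
rng = np.random.default_rng(0)
reference = rng.normal(0.0, 1.0, size=(5000, 3))   # training-time sample
incoming = rng.normal(0.5, 1.0, size=(1000, 3))    # shifted production batch
alerts = detect_drift(reference, incoming, ["f0", "f1", "f2"])
if alerts:
    print("Potential drift, flag for human review:", alerts)
```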
Hi @jax79sg,
I would appreciate knowing which of the two articles you are referring to.
I would like to cite the first article:
“In general, sensible model surveillance combined with a well thought out schedule of model checks is crucial to keeping a production model accurate. Prioritizing checks on the key variables and setting up warnings for when a change has taken place will ensure that you are never caught by a surprise by a change to the environment that robs your model of its efficacy.”
Can’t this be done with data-centric monitoring?
Best regards,
Thanks @JoshE, I missed that sentence. So it would be better practice to (see the sketch after this list):
- Perform drift detection as an unmanned (cheaper) early warning of potential model failure.
- Perform model performance monitoring (usually more expensive due to the need to label data).
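As a rough sketch of that two-tier practice (hypothetical glue code, reusing the `detect_drift` function from my earlier sketch): run the cheap, unlabeled drift check on every batch, and trigger the expensive labeled evaluation only on a schedule or when drift fires.

```python
# Hypothetical glue code for the two-tier practice above; assumes the
# detect_drift() sketch from my earlier post is in scope.
def request_labels_and_evaluate(batch):
    """Hypothetical hook: queue the batch for human labeling and
    recompute model metrics once the labels arrive."""
    print(f"Queued {len(batch)} records for labeled evaluation")

def monitor_batch(batch, reference, feature_names, scheduled_eval_due=False):
    drifted = detect_drift(reference, batch, feature_names)  # cheap, no labels needed
    if drifted or scheduled_eval_due:
        request_labels_and_evaluate(batch)                   # expensive, labeled path
    return drifted
```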
Is it fair to say that using drift detection alone is not a reliable indicator of model failure, since model sensitivity to drift can vary and is generally hard to quantify?
I would say it is model-dependent.