Deciding which models to retrain in a pipeline of chained models

In the real world, multiple models tend to be chained together to perform end-user tasks, so performance metrics are typically user-based and end-to-end, rather than tied to any individual model's output. How should we decide which model to retrain to achieve the best performance stability over time in production?

Hello jax79sg and welcome to the forum,

I would consider two possibilities, because an AI system = code + data:

  • Model-centric - how can you change the model code to improve performance?
  • Data-centric - how can you systematically change the data to improve performance?

You should, of course, monitor performance in the production environment so you can make these decisions.
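As a minimal sketch of what that monitoring could feed into, here is one way to compare each stage's production metric against its offline baseline and flag the most-degraded stage as a retraining candidate. The stage names and scores below are entirely hypothetical, not from the thread:

```python
# Hypothetical sketch: track per-stage metrics alongside the end-to-end
# metric, then flag the stage whose score has drifted most from baseline.

baseline = {"detector": 0.92, "ocr": 0.88, "translator": 0.85}  # offline eval
current  = {"detector": 0.91, "ocr": 0.79, "translator": 0.84}  # production

def most_degraded(baseline, current):
    """Return (stage, relative_drop) for the stage with the largest
    relative drop from its baseline score."""
    drops = {s: (baseline[s] - current[s]) / baseline[s] for s in baseline}
    stage = max(drops, key=drops.get)
    return stage, drops[stage]

stage, drop = most_degraded(baseline, current)
print(f"Retraining candidate: {stage} (relative drop {drop:.1%})")
```

This only localizes *drift*, not end-to-end impact; a stage with a small metric drop can still dominate the user-facing error if later stages amplify it.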

Hi, thanks, but this reply doesn’t really answer the question; I think the problem is more complex than that.

I have the same question as @jax79sg. Oftentimes in complex systems of systems, multiple models are inherently chained together, and their errors quickly get amplified and propagated down the line.

@tomstryja (or any other mentor) could you share some examples of how we can tackle such a situation? This is especially tricky when multiple teams work on different models: each model may look fine in isolation, but once chained together the system becomes somewhat unusable due to the large overall error.
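One common diagnostic for exactly this situation (not from the thread itself, just a sketch) is oracle substitution: on a labelled evaluation set, replace each stage's real output with ground truth and measure how much the end-to-end metric recovers. The stage whose "oracling" recovers the most is the strongest retraining candidate. All names and numbers below are illustrative:

```python
# Hypothetical "oracle substitution" sketch: replace one stage's output
# with ground truth at a time, rerun the rest of the pipeline, and see
# how much the end-to-end metric improves over the real pipeline.

def end_to_end_score(outputs):
    # Stand-in for a real end-to-end metric; here, fraction of correct items.
    return sum(outputs) / len(outputs)

# Toy per-example correctness (1 = correct) for each configuration.
# In practice you'd rerun the downstream stages with oracle inputs.
scores = {
    "none":     end_to_end_score([1, 0, 0, 1, 0]),  # real pipeline
    "detector": end_to_end_score([1, 0, 1, 1, 0]),  # oracle detector stage
    "ocr":      end_to_end_score([1, 1, 1, 1, 0]),  # oracle ocr stage
}

recovery = {s: scores[s] - scores["none"] for s in scores if s != "none"}
best = max(recovery, key=recovery.get)
print(f"Largest recovery from fixing: {best} (+{recovery[best]:.2f})")
```

This is essentially a chain-level ablation: it directly attributes end-to-end error to stages, which is useful when each team's isolated metric looks healthy but the composed system does not.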