What is the best way to find a balance between a data-centric and a model-centric approach?

srik · August 11, 2021, 6:17pm

yurij · August 14, 2021, 7:52am

Hi, @srik, and welcome to the community!

TL;DR: data-centric first, model-centric second. Switch your attention to the model only once you have exhausted your options for improving the data.

This is a debatable topic. In my opinion, you should always start by picking a proven, well-known model suitable for your task, training it on the data you have, and using the results as the baseline for all of your future work. If you have to pick between multiple models at that stage, pick the simpler one. Simpler models are easier to train, which lets you iterate faster. Also, we tend to overestimate the complexity of the model a task needs.

Then, focus on improving your data. Go fully data-centric, and don’t forget to evaluate your model after every improvement and compare the result with the baseline.
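
To make that concrete, here is a rough sketch of the "fixed model, changing data" loop. Everything in it is an assumption for illustration: the toy one-feature dataset, the nearest-centroid classifier standing in for "a simple, proven model", and the helper names `train_centroids`, `accuracy`, and `evaluate_data_versions`. The only point is that the model and the validation set stay the same while the training data changes, so any change in score can be attributed to the data.

```python
from statistics import mean

def train_centroids(rows):
    """Fit a nearest-centroid classifier: one mean feature value per label."""
    by_label = {}
    for x, label in rows:
        by_label.setdefault(label, []).append(x)
    return {label: mean(xs) for label, xs in by_label.items()}

def accuracy(centroids, rows):
    """Fraction of rows whose nearest centroid carries the correct label."""
    hits = sum(
        min(centroids, key=lambda c: abs(centroids[c] - x)) == label
        for x, label in rows
    )
    return hits / len(rows)

def evaluate_data_versions(versions, valid):
    """Train the same model on each dataset version and compare to the baseline.

    The first version is treated as the baseline. Returns a list of
    (name, score, gain_over_baseline) tuples.
    """
    results = []
    baseline = None
    for name, rows in versions:
        score = accuracy(train_centroids(rows), valid)
        if baseline is None:
            baseline = score
        results.append((name, score, score - baseline))
    return results

# Hypothetical toy data: the row (14.0, "a") is a labeling mistake.
train_v0 = [(1.0, "a"), (2.0, "a"), (3.0, "a"), (14.0, "a"),
            (8.0, "b"), (9.0, "b"), (10.0, "b")]
# Data improvement: drop the suspicious row. The model code is untouched.
train_v1 = [row for row in train_v0 if row != (14.0, "a")]

# The validation set is fixed so that scores stay comparable across versions.
valid = [(1.0, "a"), (3.0, "a"), (5.0, "a"),
         (6.0, "b"), (8.0, "b"), (10.0, "b")]

results = evaluate_data_versions(
    [("v0 baseline", train_v0), ("v1 bad label removed", train_v1)],
    valid,
)
for name, score, gain in results:
    print(f"{name}: accuracy={score:.3f} gain={gain:+.3f}")
```

In a real project the classifier would be whatever baseline model you picked, and each "version" would be the dataset after one concrete fix (relabeling, deduplication, new examples for a weak class, and so on).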

At some point, you’ll notice that the model isn’t improving anymore and/or that you’ve exhausted all your options for improving the data. If, at this point, you are not happy with the performance of your model, switch your attention to the model. You may, for example, pick a more complex model and evaluate it, or look into fine-tuning the architecture manually. But don’t spend too much time on it; if you still have options to improve your data, apply them with the new model and see if it helps.
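
If you want to turn the "when do I switch?" question into something mechanical, one possible sketch is below. The function name `next_focus` and the thresholds (`target`, `min_gain`, `patience`) are made up for this example, not a standard recipe; tune them to your own metric and noise level.

```python
def next_focus(scores, target, min_gain=0.005, patience=3, data_options_left=True):
    """Suggest where to spend effort next: 'done', 'data', or 'model'.

    scores[0] is the baseline; each later entry is the validation score
    after one more data improvement. All thresholds are illustrative.
    """
    if scores[-1] >= target:
        return "done"  # performance is already good enough
    # Look at the last `patience` data improvements plus the score before them.
    recent = scores[-(patience + 1):]
    plateaued = (
        len(recent) > patience
        and max(recent[1:]) - recent[0] < min_gain
    )
    if plateaued or not data_options_left:
        return "model"  # data work has stalled or run out, so look at the model
    return "data"       # data improvements are still paying off

print(next_focus([0.70, 0.75, 0.80], target=0.90))
print(next_focus([0.70, 0.80, 0.801, 0.802, 0.801], target=0.90))
```

The first call has too little history to declare a plateau, so it keeps you on the data; the second shows three data improvements in a row that added less than `min_gain`, so it suggests looking at the model. After switching models, you would start a fresh score history and go back to the data.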
