I need more explanation of the data-centric approach


Please, I need more explanation of the difference between data-centric and data augmentation.

Please specify the lecture & timestamp of what you need help with.

Hi there,

The term data-centric is often used as an alternative to the model-centric paradigm. The idea is that focusing on high-quality data (e.g. through data monitoring and cleaning functions), which serves both as input and as training data for an AI system, often gives you more leverage than choosing between model A and model B. See also this video from Prof. Ng: Andrew Ng "The Data-Centric AI Approach" - YouTube

I can fully echo this view from a practitioner's perspective.
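
To make the data-centric idea a bit more concrete, here is a minimal sketch of a cleaning function. The `(text, label)` record format, the function name `clean_dataset`, and the two rules (drop duplicates, drop inputs with contradicting labels) are assumptions for illustration only; a real project would choose its own checks.

```python
from collections import defaultdict


def clean_dataset(examples):
    """Drop duplicate inputs and inputs whose labels contradict each other.

    `examples` is a list of (text, label) pairs (hypothetical format).
    """
    # Collect every label that was assigned to the same normalized input.
    labels_per_input = defaultdict(set)
    for text, label in examples:
        labels_per_input[text.strip().lower()].add(label)

    cleaned = []
    seen = set()
    for text, label in examples:
        key = text.strip().lower()
        if len(labels_per_input[key]) > 1:
            continue  # contradicting labels: better reviewed by a human than trained on
        if key in seen:
            continue  # duplicate of an example we already kept
        seen.add(key)
        cleaned.append((key, label))
    return cleaned


raw = [
    ("Great product", "positive"),
    ("great product ", "positive"),  # duplicate after normalization
    ("Arrived late", "negative"),
    ("arrived late", "positive"),    # contradicts the line above
    ("Works fine", "positive"),
]
cleaned = clean_dataset(raw)
print(cleaned)
```

Note that the model is not touched at all here: the only thing that changes is the quality of the data it would be trained on.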

Data augmentation, on the other hand, refers to how you expand your training data, e.g. via:

  • adding noise to your labels
  • applying rotational or shifting operations on pictures in a CV application
  • transfer learning, see also this thread

The purpose of data augmentation is to increase the amount of labeled training data and thereby get a more reliable and robust machine learning model, and potentially to close gaps in the training space so that the model generalizes better.
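
As a rough illustration of the rotation, shifting and noise operations mentioned above, here is a small NumPy sketch. The function name `augment_image`, the parameters `max_shift` and `noise_std`, and the choice to add the noise to the pixel values are my own assumptions; in practice you would typically use the augmentation utilities of your deep learning framework.

```python
import numpy as np


def augment_image(image, rng, max_shift=2, noise_std=0.05):
    """Return a randomly rotated, shifted and noised copy of a 2D image in [0, 1]."""
    # Rotate by a random multiple of 90 degrees.
    k = int(rng.integers(0, 4))
    rotated = np.rot90(image, k)

    # Shift by a few pixels; np.roll wraps pixels around the border.
    dy, dx = rng.integers(-max_shift, max_shift + 1, size=2)
    shifted = np.roll(rotated, shift=(int(dy), int(dx)), axis=(0, 1))

    # Add Gaussian noise to the pixel values and keep them in the valid range.
    noisy = shifted + rng.normal(0.0, noise_std, size=shifted.shape)
    return np.clip(noisy, 0.0, 1.0)


rng = np.random.default_rng(seed=0)
image = rng.random((8, 8))  # stand-in for one labeled training picture
label = 3                   # the label is kept unchanged for every copy

# One labeled example becomes several labeled examples.
augmented = [(augment_image(image, rng), label) for _ in range(4)]
print(len(augmented), augmented[0][0].shape)
```

Each augmented copy keeps the original label, which is how one labeled picture turns into several training examples.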

@Austinkayd: please let me know if this answers your question.

Have a good one!

Best regards