Let’s talk about Supervised Learning and labels. In Supervised Learning, to train your model you need to provide, in both the training and the validation datasets, the input to the model (the X) and the expected output for that input (the labels, the Y).
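As a minimal sketch of what this pairing looks like in practice (the features and labels below are made up for illustration), you keep part of the labelled data aside as a validation set and fit the model only on the training portion:

```python
# Hypothetical toy data: X holds one row of features per example,
# y holds the human-provided label for each row.
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression

X = [[0.1, 0.8], [0.9, 0.2], [0.4, 0.5], [0.7, 0.3],
     [0.2, 0.9], [0.8, 0.1], [0.3, 0.6], [0.6, 0.4]]
y = [0, 1, 0, 1, 0, 1, 0, 1]

# Hold out a quarter of the labelled data as the validation set.
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42)

model = LogisticRegression()
model.fit(X_train, y_train)        # learns the mapping X -> Y
print(model.score(X_val, y_val))   # accuracy on held-out labelled data
```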
We must be aware that labels may not be perfect, especially when they have been provided by humans. Even the most respected subject matter expert (the great radiologist…) can make mistakes. We hope that the number of mistakes in the labels is very small. One way to improve the model’s accuracy is to review the cases in the training and validation sets that the model predicts wrongly: sometimes we discover that it is the label that is wrong, correct it, and retrain.
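A hedged sketch of this label-error hunt, reusing the names from the previous snippet: re-score the training data with the trained model and surface the disagreements for human review.

```python
import numpy as np

# Assumes `model`, `X_train`, `y_train` exist as in the earlier sketch.
preds = model.predict(X_train)
suspect_idx = np.where(preds != np.asarray(y_train))[0]

for i in suspect_idx:
    # A reviewer (e.g. the radiologist) decides whether the label or the
    # model is wrong; corrected labels go back into y_train before retraining.
    print(f"example {i}: label={y_train[i]}, model says {preds[i]} -> review")
```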
In addition, we should monitor the model during the inference phase, tracking the accuracy of its predictions, and again use human intervention to check whether some predictions are wrong (in the inference phase you obviously don’t have the ground truth). Then you can retrain on the extended training set (old + new data) to improve the model’s accuracy.
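A minimal sketch of that retraining loop, under the assumption that a reviewer later supplies ground truth for some of the inputs seen in production (the new examples below are hypothetical):

```python
import numpy as np

# Inputs collected during inference, with labels added later by a reviewer.
X_new = [[0.2, 0.7], [0.8, 0.1]]
y_new = [0, 1]

# Extend the original training set with the newly labelled data and retrain.
X_extended = np.vstack([X_train, X_new])
y_extended = np.concatenate([y_train, y_new])

model.fit(X_extended, y_extended)
print(model.score(X_val, y_val))   # check whether validation accuracy improved
```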
As Andrew Ng pointed out in Course One, with this data-centric approach you can often get bigger improvements than with hyper-parameter tuning alone.