Data labelling (creating labels)

The instructor mentioned: "you are taking the features themselves from the inference requests that your model is getting. The predictions that your model is being asked to make and the features that are provided for that. You get labels for those inference requests by monitoring systems and using the feedback from those systems to label that data."

I don’t understand how we are getting labels. It totally confuses me.

Hi @aamir

Could you be so kind as to tell us which week and which lecture this is from?

Week 1, the "Labelling Data" topic of Course 2.

Hi @aamir, ok, thanks.

I’ll do my best to make this clearer.

Let’s talk about Supervised Learning and labels. In Supervised Learning, to train your model you need to provide, in the training and validation datasets, both the input for the model (the X) and the expected result from the model for that input (the labels, the Y).
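To make the (X, Y) idea concrete, here is a minimal sketch. The spam example and all the names in it are hypothetical, not from the lecture:

```python
# Toy supervised-learning dataset (hypothetical spam example):
# every training input in X comes with the answer we expect, its label in y.
X = [
    "win a free prize now",   # input features: the email text
    "meeting moved to 3pm",
]
y = [1, 0]                    # labels: 1 = spam, 0 = not spam (assigned by a human)

# The model is trained on (features, label) pairs:
training_pairs = list(zip(X, y))
```

The whole question of "where do labels come from" is about how the values in `y` get filled in, whether by a human annotator or, as below, by feedback collected in production.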
We must be aware that labels may not be perfect. This can happen especially when the labels have been provided by humans: even the most respected subject matter expert (the great radiologist…) can make mistakes. We hope that the number of mistakes in the labels is very small. We can improve the accuracy of the model by reviewing which cases of the training and validation sets the model predicts wrongly, possibly discovering that the label itself is wrong, correcting it, and retraining.
In addition, we should monitor the model during the inference phase, tracking the accuracy of its predictions, and, again, check with human intervention whether some predictions are wrong (in the inference phase you obviously don’t have the ground truth). Then you can retrain, using the extended training set (old + new data) to improve model accuracy.
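This is the loop the instructor was describing: log the inference requests, collect feedback on them from the monitoring system, and join the two to create new labeled examples. A minimal sketch, where the click-as-feedback signal, the field names, and the ids are all hypothetical:

```python
# At inference time we log each request: its features and the model's prediction.
inference_log = [
    {"id": 1, "features": {"query": "cat video"}, "prediction": "video_A"},
    {"id": 2, "features": {"query": "news"},      "prediction": "article_B"},
]

# Later, the monitoring system records user feedback on those same requests,
# e.g. whether the user clicked the recommended item (an implicit label).
feedback = {1: True, 2: False}   # request id -> clicked?

# Join logs and feedback by request id to produce new (features, label) pairs.
new_training_data = [
    (entry["features"], int(feedback[entry["id"]]))
    for entry in inference_log
    if entry["id"] in feedback
]

# new_training_data can now be appended to the original training set
# for the next retraining run.
```

So the "labels" here are not typed in by an annotator; they are derived after the fact from what the monitoring system observed about each prediction.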
As Andrew pointed out in Course One, using this Data-Centric approach one can get bigger improvements than by simply doing hyperparameter tuning.