Let’s talk about Supervised Learning and labels. In Supervised Learning, to train your model you need to provide, in both the training and the validation datasets, the input to the model (the X) and the expected output for that input (the labels, the Y).
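As a minimal sketch of what this pairing looks like in practice (the features and labels below are made up for illustration), you keep part of the labelled data aside as a validation set and fit the model only on the training portion:

```python
# Hypothetical toy data: X holds one row of features per example,
# y holds the human-provided label for each row.
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression

X = [[0.1, 0.8], [0.9, 0.2], [0.4, 0.5], [0.7, 0.3],
     [0.2, 0.9], [0.8, 0.1], [0.3, 0.6], [0.6, 0.4]]
y = [0, 1, 0, 1, 0, 1, 0, 1]

# Hold out a quarter of the labelled data as the validation set.
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42)

model = LogisticRegression()
model.fit(X_train, y_train)        # learns the mapping X -> Y
print(model.score(X_val, y_val))   # accuracy on held-out labelled data
```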
We must be aware that labels may not be perfect, especially when they have been provided by humans. Even the most respected subject matter expert (the great radiologist…) can make mistakes. We hope that the number of mistakes in the labels is very small. One way to improve the model’s accuracy is to review the cases in the training and validation sets that the model predicts wrongly: sometimes we discover that it is the label that is wrong, correct it, and retrain.
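A hedged sketch of this label-error hunt, reusing the names from the previous snippet: re-score the training data with the trained model and surface the disagreements for human review.

```python
import numpy as np

# Assumes `model`, `X_train`, `y_train` exist as in the earlier sketch.
preds = model.predict(X_train)
suspect_idx = np.where(preds != np.asarray(y_train))[0]

for i in suspect_idx:
    # A reviewer (e.g. the radiologist) decides whether the label or the
    # model is wrong; corrected labels go back into y_train before retraining.
    print(f"example {i}: label={y_train[i]}, model says {preds[i]} -> review")
```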
In addition, we should monitor the model during the inference phase, tracking the accuracy of its predictions, and again use human intervention to check whether some predictions are wrong (in the inference phase you obviously don’t have the ground truth). Then you can retrain on the extended training set (old + new data) to improve the model’s accuracy.
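A minimal sketch of that retraining loop, under the assumption that a reviewer later supplies ground truth for some of the inputs seen in production (the new examples below are hypothetical):

```python
import numpy as np

# Inputs collected during inference, with labels added later by a reviewer.
X_new = [[0.2, 0.7], [0.8, 0.1]]
y_new = [0, 1]

# Extend the original training set with the newly labelled data and retrain.
X_extended = np.vstack([X_train, X_new])
y_extended = np.concatenate([y_train, y_new])

model.fit(X_extended, y_extended)
print(model.score(X_val, y_val))   # check whether validation accuracy improved
```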
As Andrew Ng pointed out in Course One, with this data-centric approach you can often get bigger improvements than with hyper-parameter tuning alone.