In week 3’s lectures, HLP and the problem of label consistency are explained in detail, and some examples are given where it’s suggested that labelers conform to a common set of rules when providing labels. For example, in the case of identifying phone defects, a rule that could resolve ambiguity would be to label any scratch larger than 0.3 mm as a defect. This kind of method is expected to raise HLP and improve the performance of the ML system as well. But doesn’t this method essentially bake the rule we would otherwise want the learning algorithm to discover into the definition of the dataset itself? In other words, when we create deterministic rules of that form, isn’t it natural for the learning algorithm to simply catch up to (and probably just verify or reinforce) these rules?
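(To make the kind of convention I mean concrete, here is a minimal sketch in Python; the 0.3 mm threshold is the example from the lecture, and the function and field names are just my own illustration, not anything from the course.)

```python
# Minimal sketch of a deterministic labeling convention.
# Assumes a scratch is described only by its length in millimetres;
# the 0.3 mm threshold comes from the lecture example, everything else
# (names, signature) is hypothetical.
SCRATCH_DEFECT_THRESHOLD_MM = 0.3

def label_scratch(length_mm: float) -> int:
    """Return 1 (defect) if the scratch exceeds the agreed threshold, else 0."""
    return 1 if length_mm > SCRATCH_DEFECT_THRESHOLD_MM else 0

# Every labeler applying the same rule produces consistent labels:
print(label_scratch(0.35))  # 1 -> defect
print(label_scratch(0.20))  # 0 -> no defect
```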
Hi Giannis_Antoniadis,
That is precisely the problem: the ML system does not learn such a rule well when the labels it is fed during training are inconsistent. For example, the ML system may fail to predict that a 0.35 mm scratch indicates a defect if some labelers do not label it as such. The inconsistent labelling distorts the calibration of the ML system.
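To make the calibration point concrete, here is a small, hypothetical simulation (not from the course): a model is trained on scratch lengths labeled once with the consistent 0.3 mm rule and once with labels that disagree near the boundary, and the inconsistently trained model typically ends up noticeably less confident that a 0.35 mm scratch is a defect. The noise band and flip rate are made up for illustration.

```python
# Hypothetical simulation of how inconsistent labels distort calibration.
# Assumes scratches are described only by their length (mm); the 0.3 mm rule
# is from the lecture example, the noise band and flip rate are invented.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
lengths = rng.uniform(0.0, 1.0, size=2000).reshape(-1, 1)

# Consistent labels: every labeler applies the agreed 0.3 mm rule.
consistent = (lengths[:, 0] > 0.3).astype(int)

# Inconsistent labels: near the boundary (0.2-0.4 mm) labelers disagree,
# so 40% of those labels are flipped at random.
inconsistent = consistent.copy()
near_boundary = (lengths[:, 0] > 0.2) & (lengths[:, 0] < 0.4)
flip = near_boundary & (rng.random(len(lengths)) < 0.4)
inconsistent[flip] = 1 - inconsistent[flip]

model_clean = LogisticRegression().fit(lengths, consistent)
model_noisy = LogisticRegression().fit(lengths, inconsistent)

x = np.array([[0.35]])  # the 0.35 mm scratch mentioned above
print("P(defect | 0.35 mm), consistent labels:  ",
      round(model_clean.predict_proba(x)[0, 1], 3))
print("P(defect | 0.35 mm), inconsistent labels:",
      round(model_noisy.predict_proba(x)[0, 1], 3))
```

The point of the sketch is that the underlying rule is still "learnable" in both cases, but the noisy labels pull the model's predicted probabilities toward uncertainty exactly in the region where the convention was supposed to remove ambiguity.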