C1_W3: The Problem with beating HLP as a proof of ML superiority

Dear all,

I'd like to understand the case explained at the end of slide 26 and all of slide 27 of week 3. Could you point me to any more documentation about it?

I can't understand it even with the transcript.

I'd be very grateful.

Best regards.

The lecture is rock solid on this one. I’ll give it a shot:

Slide 26:
This slide suggests other uses of human-level performance (HLP).

  1. In academia, if your research performs better than HLP, the odds of getting published are high.
  2. Say you have a client who wants a model with, say, 95% accuracy on a task. If HLP for that task is around 60%, you can use HLP as a baseline to negotiate and establish a much more realistic target.
  3. In business, it's possible to create buzz by stating that your product beats HLP by a certain margin. While it's tempting to do this, care should be taken when establishing this win. More often than not, it's hard for a product to beat HLP. So, when possible, stay away from this claim, and if you do decide to claim that your product is better than HLP, gather sufficient evidence to back it up.

Slide 27:
Consider a speech-to-text system. Say that for a given audio clip, 70% of the labelers produce one transcript (version 1) and the remaining 30% produce a slightly different transcript (version 2). Both versions mean the same thing even though the text differs slightly.
The probability that two random labelers agree is 0.7^2 + 0.3^2 = 0.58. This can be used as HLP.
When the model is trained, it will likely pay more attention to version 1, since that version appears 70% of the time. So a model that always outputs version 1 will match the labels about 70% of the time, i.e., model performance will be around 70%.
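The slide-27 arithmetic can be sketched in a few lines of Python. This is a minimal illustration (the split of 70%/30% is the example from the slide, not real data), assuming two labelers transcribe independently and the model simply learns the majority transcript:

```python
# Two labelers independently transcribe the same clip:
# 70% write version 1, 30% write version 2.
p_v1, p_v2 = 0.7, 0.3

# Probability that two random labelers agree (both pick v1, or both pick v2).
# This agreement rate is what the slide uses as HLP.
hlp = p_v1**2 + p_v2**2  # 0.49 + 0.09 = 0.58

# A model that always outputs the majority transcript (version 1)
# matches the label whenever the labeler happened to write v1.
model_accuracy = p_v1  # 0.70

print(f"HLP (labeler agreement): {hlp:.2f}")
print(f"Model accuracy:          {model_accuracy:.2f}")
# The model appears to beat HLP (0.70 > 0.58) purely because
# the labels are inconsistent, not because it is superhuman.
```

If the labelers had agreed on a single transcription convention, HLP would be close to 1.0 and the apparent "win" would disappear.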

This goes back to the business use case. On paper, the model seems to have beaten HLP, although logically it hasn't. Inconsistency in the labels has given a false impression of model performance.