Getting labeled data from humans even when ML is better than humans

On C3W1 slide 25, “Why compare to human level performance,” one stated benefit of ML being worse than humans is that in this case you can “get labeled data from humans.” However, having worked with audio data for ML for a few years, I can say that you can still “get labeled data from humans” even when your ML system is better than humans. The reason is simple: for problems like audio denoising, the “labeled” training data is generated by humans, and the generated data need not be perceptible to a human in accordance with its label. Here’s an example.

  • Say we have a speech detection task. The detector should output “1” if it detects human speech in an audio segment and “0” if not.
  • To create training data, humans select audio of a human speaker saying “Hello World” and mix it with a large amount of noise, at an SNR of -10 dB. The human labeler attaches the label “1” to that audio segment and “0” to the surrounding audio, which lacks human speech.
  • A human “labeler” listening to the recording cannot hear “Hello World” because there is so much noise. However, an ML algorithm trained to detect speech may well detect it even though it is imperceptible to a human listener.

See what we did there? We got “labeled data from humans” that a human could not label after the fact.
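The data-generation step above can be sketched as follows. This is a minimal illustration, not code from the course: the function name `mix_at_snr` and the toy sine-tone "speech" signal are my own inventions, chosen only to show how a human can deliberately bury a labeled signal at -10 dB SNR.

```python
import numpy as np

def mix_at_snr(speech: np.ndarray, noise: np.ndarray, snr_db: float) -> np.ndarray:
    """Scale `noise` so the speech-to-noise ratio of the mix is `snr_db` dB,
    then return the noisy mixture (illustrative sketch)."""
    p_speech = np.mean(speech ** 2)   # average speech power
    p_noise = np.mean(noise ** 2)     # average noise power
    # SNR(dB) = 10*log10(Ps / Pn)  =>  required noise power for the target SNR:
    target_p_noise = p_speech / (10 ** (snr_db / 10))
    scaled_noise = noise * np.sqrt(target_p_noise / p_noise)
    return speech + scaled_noise

# Toy example: one second of a 440 Hz tone standing in for "Hello World",
# buried in white noise at -10 dB SNR. A human labeler still attaches "1".
sr = 16000
t = np.arange(sr) / sr
speech = 0.1 * np.sin(2 * np.pi * 440.0 * t)
noise = np.random.default_rng(0).standard_normal(sr)

mixed = mix_at_snr(speech, noise, -10.0)  # labeled "1" by construction

# Verify the achieved SNR of the mixture (the noise is ten times the
# signal power, so the speech is essentially inaudible).
achieved = 10 * np.log10(np.mean(speech ** 2) / np.mean((mixed - speech) ** 2))
```

The key point is that the label “1” comes from how the data was constructed, not from anyone listening to the result.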

Therefore, the first bullet point on this slide should be updated to read “Use humans to label existing data.” That wording excludes the very common case where humans generate labeled training data from scratch, data that does not exist in the wild.


Hi am003e,

Great point!
I have opened an issue on GitHub suggesting that this be looked into for a future revision of the lecture/slides.