Human Level Performance, how to set it?

Hello everyone,

We learned in the lecture that Human Level Performance is a proxy for the Bayes Optimal Error. I was wondering how to actually establish this Human Level Performance in practice, given that we typically only have a labelled dataset. Do we actually ask people to label a set of examples so that we can measure how many errors they make? (With millions of examples, this does not seem practical.) How is this done in practice?

I would really appreciate it if someone could give some insight.

Thanks

If memory serves, I’m pretty sure Prof Ng discusses this in some detail in the lectures. It is as you say: you need real data on human performance. That is not always easy or cheap to obtain, but then neither are labels easy to obtain. In the absence of that, you just have to guess. Also as Prof Ng discusses, there are multiple levels of Human Performance: the general untrained person, trained experts and then groups of trained experts, in order of increasing performance. The error rates at all of these levels are by definition greater than or equal to the Bayes Error. The problem is that Bayes Error is not directly calculable in most cases, which is why we just use Human Error as an upper bound for Bayes Error.
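
As a rough illustration of the measurement described above, here is a minimal sketch that estimates human error from a small random sample of examples instead of the whole dataset. The function name `human_error_estimate`, the toy labels, and the choice of a Wilson score interval are all assumptions made for illustration; none of them come from the lectures.

```python
import math

def human_error_estimate(reference_labels, human_labels, z=1.96):
    """Return (error_rate, lower, upper) using a Wilson score interval.

    reference_labels: trusted labels for a random sample of examples
    human_labels: labels a human annotator gave for the same examples
    z: normal quantile (1.96 corresponds to roughly a 95% interval)
    """
    if len(reference_labels) != len(human_labels) or not reference_labels:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(reference_labels)
    # Count the examples where the human disagrees with the reference label.
    errors = sum(r != h for r, h in zip(reference_labels, human_labels))
    p = errors / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return p, max(0.0, centre - half), min(1.0, centre + half)

# Hypothetical toy sample: 10 examples, the human disagrees on one of them.
reference = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1]
human     = [1, 0, 1, 0, 0, 0, 1, 0, 1, 1]
rate, low, high = human_error_estimate(reference, human)
print(f"human error ~ {rate:.2f} (interval {low:.2f} to {high:.2f})")
```

With so few examples the interval is very wide. A larger sample narrows it, which is the trade-off between labelling cost and how precisely the human error is known.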

It is great to read your insightful thoughts about this. I can see that getting real labels from humans is very time consuming and probably also costly in terms of money. I imagined a simple scenario in which one person has to label 1 million data points (to measure human performance and thus human error) at a rate of 1000 data points per day. That would take 1000 days, which is far more than three months: it is close to three years. This is really a full-time consulting job.
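
To make the arithmetic concrete, here is a small sketch. The daily rate and dataset size are the ones from the scenario above. The 2,000-example sample is a hypothetical figure, chosen only to show how much a random sample would shorten the job.

```python
import math

def labelling_days(num_examples, examples_per_day):
    """Whole days needed to label num_examples at a fixed daily rate."""
    return math.ceil(num_examples / examples_per_day)

full_days = labelling_days(1_000_000, 1_000)  # labelling the whole dataset
sample_days = labelling_days(2_000, 1_000)    # hypothetical random sample
print(full_days, "days for the full dataset, about", round(full_days / 365, 1), "years")
print(sample_days, "days for the sample")
```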

It is beyond the scope of the DLS series to study how data is collected and prepared for use in training and evaluating DL models (or any kind of ML models). That is a separate area generally called “Data Science”. There is another specialization here from DeepLearning.ai called the Practical Data Science specialization. I have not personally taken that yet, but I did take the Johns Hopkins Data Science series on Coursera a few years back. If you are curious about these issues, take a look at the PDS specialization and check out the syllabus.
