Deep Learning Specialization → Course 3 → Week 2 → Cleaning Up Incorrectly Labeled Data.
I have a question regarding the slide attached.
I don't clearly understand the second point. Could you please give me a practical example of how to do that, and explain why?
Here is the transcript of the section I am referring to:
" Second, I would urge you to consider examining examples your algorithm got right as well as ones it got wrong. It is easy to look at the examples your algorithm got wrong and just see if any of those need to be fixed. But it’s possible that there are some examples that you haven’t got right, that should also be fixed. And if you only fix ones that your algorithms got wrong, you end up with more bias estimates of the error of your algorithm. It gives your algorithm a little bit of an unfair advantage. If you just try to double check what it got wrong but you don’t also double check what it got right because it might have gotten something right, that it was just lucky on fixing the label would cause it to go from being right to being wrong, on that example. The second bullet isn’t always easy to do, so it’s not always done. The reason it’s not always done is because if you classifier’s very accurate, then it’s getting fewer things wrong than right. So if your classifier has 98% accuracy, then it’s getting 2% of things wrong and 98% of things right. So it’s much easier to examine and validate the labels on 2% of the data and it takes much longer to validate labels on 98% of the data, so this isn’t always done. That’s just something to consider."
There is a lot of subtlety in many of the issues that Prof Ng covers in Course 3. The point of this section is to evaluate the effect of incorrectly labeled data on the performance of your algorithm. His point is that most people concentrate on cases in which the algorithm's prediction does not agree with the label and the label is wrong. It's easier to find cases like that, because there are far fewer instances where the prediction disagrees with the label. And in that case, you could almost argue that the incorrect label actually does no harm, because there must have been enough valid data for the algorithm to recognize that the label was wrong. In other words, the algorithm wasn't fooled by the bad label.
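To make that concrete, here is a minimal sketch of splitting the dev set into agreement and disagreement cases and tallying bad labels in both groups. Everything here is invented for illustration: the arrays, and especially the "manual review" result, which in practice is a human looking at each example.

```python
import numpy as np

# Hypothetical dev-set labels and model predictions for a binary task.
labels = np.array([1, 0, 1, 1, 0, 1, 0, 0, 1, 0])
preds  = np.array([1, 0, 0, 1, 0, 1, 1, 0, 1, 0])

wrong_idx = np.where(preds != labels)[0]  # prediction disagrees with label
right_idx = np.where(preds == labels)[0]  # prediction agrees with label

# In reality a human reviews each example and records whether the *label*
# itself is wrong. Here that review is faked with a made-up set of indices.
mislabeled = {2, 7}

bad_among_wrong = [i for i in wrong_idx if i in mislabeled]
bad_among_right = [i for i in right_idx if i in mislabeled]

print("bad labels among disagreements:", bad_among_wrong)  # [2]
print("bad labels among agreements:  ", bad_among_right)   # [7]
```

In this toy setup, fixing only the bad label at index 2 (a disagreement) while never looking at index 7 (an agreement with a bad label) flips one counted error into a non-error without the symmetric correction, which is exactly the optimistic bias Prof Ng describes.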
The harder case is that your algorithm agrees with the label and the label is still wrong. That’s actually the more dangerous case, you could argue. But it’s also statistically harder to find just because the population of samples with “correct” predictions is so much larger.
It's been a while since I listened to these lectures, so I don't have all the details at hand, but I think I remember him recommending doing this type of error analysis only on a small subset of the data. Of course it's impractical to do it at large scale in any case. The question is how likely it is that you'll find some "actionable" pattern of errors in your labeling.
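As a rough sketch of what sampling a small subset might look like, using Prof Ng's 98% accuracy figure but with an assumed dev-set size, review budget, and review outcome that are not from the lecture:

```python
import numpy as np

rng = np.random.default_rng(0)

# Suppose a 10,000-example dev set with a 98%-accurate classifier:
# ~9,800 predictions agree with their labels, far too many to review by hand.
n_right = 9800
sample_size = 100  # hypothetical review budget

# Indices of the examples a human would actually inspect.
sample_idx = rng.choice(n_right, size=sample_size, replace=False)

# Pretend the manual review of those sampled examples found 3 bad labels.
bad_in_sample = 3  # made-up result of the human review
rate = bad_in_sample / sample_size
print(f"estimated mislabel rate among 'correct' cases: {rate:.1%}")
print(f"extrapolated bad labels in the full 'correct' pool: ~{rate * n_right:.0f}")
```

Whether an estimated rate like that is worth acting on depends on how it compares with the overall dev-set error, which is the comparison the lecture suggests making.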
Thank you very much for your prompt reply.
It was very helpful.
I want to rephrase what I understood in simpler words; I would appreciate it if you could let me know whether I got it wrong.
It means that for error analysis we should not only tabulate the misclassified cases and assess which of those were due to incorrect labeling, but also do the same for the correctly classified cases and check whether there is significant incorrect labeling among them.
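For instance, I imagine the tabulation ends up as a simple count per cause, something like this (the categories and numbers are just my own invented example):

```python
from collections import Counter

# Invented notes from manually reviewing a set of dev examples,
# recording the apparent cause behind each inspected case.
review_notes = [
    "incorrect label",
    "blurry image",
    "model error",
    "incorrect label",
    "model error",
]

counts = Counter(review_notes)
total = len(review_notes)
for cause, n in counts.most_common():
    print(f"{cause}: {n}/{total} = {n / total:.0%}")
```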