Incorrectly labeled data


I have a question: what is the difference between “Incorrectly labeled” and “other causes”?
For example, in the “Incorrectly labeled” column in the lecture, the classifier made a mistake because the picture was a drawing of a cat, not a real cat. This is in the “Incorrectly labeled” column, but I guess we could make a new “Drawing” column under “other causes”, so I’m confused.
Please help me. Best regards

Hi @Koyo_Fujii ,

Let’s start with this: Error Analysis is a debugging process to find out why the model has misclassified some samples, so that remedies can be applied.

When debugging the errors manually, you can find many causes. One of these causes is ‘incorrectly labeled’. Another can be ‘blurry image’. Another can be ‘the classes overlap and there is no clear line of demarcation’, and so on.

Your question is “what is the difference between ‘incorrectly labeled’ and ‘other causes’?” They are the same in the sense that both are causes of classification errors. However, ‘incorrectly labeled’ is a common and manageable type of error, and often the most prevalent one, which is why it is useful to track it separately from the ‘other causes’.

Now, for the ‘other causes’, it is important to write down each one, as there may be another prevalent pattern that needs to be addressed.

At the end of the day what we want when doing Error Analysis is to identify the most prevalent causes and act upon them to improve the model.
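To make the tallying concrete, here is a minimal sketch in Python. It assumes you have reviewed a handful of misclassified examples by hand and noted one or more causes for each. The data, the cause names, and the `tally_causes` helper are all made up for illustration; they are not from the lecture.

```python
from collections import Counter

# Hypothetical notes from manually reviewing misclassified examples.
# Each inner list holds the causes tagged for one misclassified example.
reviewed_errors = [
    ["incorrectly labeled"],
    ["blurry image"],
    ["incorrectly labeled"],
    ["drawing, not a photo"],
    ["blurry image", "drawing, not a photo"],
    ["incorrectly labeled"],
    ["classes overlap"],
    ["drawing, not a photo"],
]

def tally_causes(errors):
    """Return (cause, count, fraction of reviewed errors), most common first."""
    counts = Counter(cause for causes in errors for cause in causes)
    total = len(errors)
    return [(cause, n, n / total) for cause, n in counts.most_common()]

summary = tally_causes(reviewed_errors)
for cause, n, frac in summary:
    print(f"{cause:25s} {n:2d}  {frac:.0%}")
```

Because one example can have several causes, the fractions may add up to more than 100%. A cause that shows up often, even if it started out as a note under ‘other causes’, is a candidate for its own column.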

Hope this sheds light on your question.

I got it, thank you so much. So am I correct in understanding that we separate “Incorrectly labeled” and “other causes” just because?

Well, yes, I guess you can say “just because” in the sense that there is no specific rule or definition that requires it. As shared before, ‘incorrectly labeled’ is a rather common cause of errors, and it is practical in error analysis to keep it apart from the rest. As you develop your own models in your specific domain, you might learn to identify another common cause of errors, and maybe you will end up separating that type of error from the ‘other causes’ as well.