Need help understanding bayes error

krithika_govindaraj · March 22, 2024, 9:22pm

In the lecture, we learned that Bayes error should be lower than the training error, but is there a separate Bayes error for the dev/test sets?

This relates to the following quiz question:

Assume you’ve finally chosen the following split between the data:

| Dataset | Contains | Error of the algorithm |
|---|---|---|
| Training | 940,000 images randomly picked from (900,000 internet images + 60,000 car’s front-facing camera images) | 2% |
| Training-Dev | 20,000 images randomly picked from (900,000 internet images + 60,000 car’s front-facing camera images) | 2.3% |
| Dev | 20,000 images from your car’s front-facing camera | 1.3% |
| Test | 20,000 images from the car’s front-facing camera | 1.1% |

You also know that human-level error on the road sign and traffic signals classification task is around 0.5%. Based on the information given you conclude that the Bayes error for the dev/test distribution is probably higher than for the train distribution. True/False?

paulinpaloalto · March 23, 2024, 12:04am

The definition of the Bayes Error is that it’s the lowest error that anyone or anything could possibly achieve on data from a given distribution. So if the training data and the dev/test data are not from the same statistical distribution, then you have to assume that the Bayes Error is not necessarily the same on the two distributions.

But if the algorithm does worse on the training data than it does on the dev/test data, then that would indicate that the dev/test data is “easier” in some sense, right? Meaning that its Bayes Error is actually lower, not higher.

Either that, or you just don’t have enough information to conclude anything about the Bayes Error of the dev/test data: just because it is apparently easier for the algorithm to see the patterns in the dev/test data doesn’t necessarily tell you that the absolute Bayes Error is lower on that data. But with either interpretation, the statement in the question is not supported, so the correct answer is “False”, right?
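To make the comparison concrete, here is a rough sketch of the gaps between the numbers in the table. The dictionary keys and the `gap` helper are just names made up for illustration, and treating the 0.5% human-level error as a stand-in for Bayes Error is an assumption (the question does not say which distribution that figure was measured on).

```python
# Error rates (in percent) copied from the table in the question.
errors = {
    "human_level": 0.5,   # assumed proxy for Bayes Error
    "training": 2.0,
    "training_dev": 2.3,
    "dev": 1.3,
    "test": 1.1,
}


def gap(later: str, earlier: str) -> float:
    """Difference between two error rates, rounded to avoid float noise."""
    return round(errors[later] - errors[earlier], 2)


avoidable_bias = gap("training", "human_level")  # training error vs. proxy
variance = gap("training_dev", "training")       # same distribution, unseen data
data_mismatch = gap("dev", "training_dev")       # change of distribution

print(f"avoidable bias: {avoidable_bias:+.1f} points")
print(f"variance:       {variance:+.1f} points")
print(f"data mismatch:  {data_mismatch:+.1f} points")
```

The data mismatch gap comes out negative: the algorithm does about one point better on the dev data than on unseen data from the training distribution. That only tells you how this particular algorithm behaves on the two distributions. It does not measure the Bayes Error of either one, and it certainly does not suggest that the dev/test Bayes Error is higher.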
