Why is it incorrect that Bayes error is higher?

someone555777 · May 12, 2023, 6:55pm

?

paulinpaloalto · May 12, 2023, 8:20pm

These questions are all pretty subtle and require some careful thought. In this example, we see that the training distribution includes both the forward-facing data (from the same distribution as the dev/test data) and the Internet data (a different distribution). But the model does worse (higher error rate) on the training distribution plus the dev/test distribution than it does on just the dev/test distribution. Usually you expect the model to do better on the training data than on the dev/test data (at least mild overfitting) or, in the perfect case, to do equally well on both. So if it does better on the dev/test data, that suggests that the dev/test data is easier for the trained algorithm to identify correctly. At the very least, you would conclude that the Bayes error is most likely equal to or lower on the dev/test data than it is on the full training distribution.

It doesn’t prove that, of course, but it would not be likely that the dev/test data is “harder” (has a higher Bayes error) than the full distribution, given the behavior posed in the question.

The semantics are pretty subtle here. If what I said above does not convince you, then it might be worth listening to the relevant lectures again.
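If it helps, here is a minimal sketch of why Bayes error is a property of a distribution rather than one constant for all data. It assumes a binary task where each example has a true probability p = P(y = 1 | x), so the best possible classifier still errs with probability min(p, 1 - p) on that example. The probabilities below are made up for illustration and are not taken from the quiz, and the names `front_facing` and `internet` are just hypothetical labels for the two data sources.

```python
def bayes_error(posteriors):
    """Bayes error of a binary task under 0-1 loss.

    posteriors: list of p = P(y = 1 | x), one per example drawn from the
    distribution. The optimal classifier predicts the more likely label,
    so it is wrong with probability min(p, 1 - p) on each example.
    """
    return sum(min(p, 1.0 - p) for p in posteriors) / len(posteriors)


# Hypothetical values, NOT from the quiz: clear images have labels that are
# almost certain, noisier images have more ambiguous labels.
front_facing = [0.99, 0.01, 0.98, 0.02]
internet = [0.90, 0.15, 0.80, 0.10]

dev_test = front_facing             # dev/test: front-facing data only
training = front_facing + internet  # training: both sources mixed

print("Bayes error, dev/test distribution:", bayes_error(dev_test))
print("Bayes error, training distribution:", bayes_error(training))
```

With these assumed numbers the dev/test distribution has the lower Bayes error, because the harder Internet examples only appear in the training mix. The reverse could hold with different data, which is why the observed error rates are evidence about Bayes error rather than a proof.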

someone555777 · August 17, 2023, 5:34pm

OK, thanks. I just didn’t quite understand how we can draw conclusions about Bayes error from the errors on different splits. I thought that Bayes error was something like a constant for all data. But OK, I understand from your answer that we can approximate it because the dev and test data came from another source.
