Course 3, Week 2, Assignment, Bayes error

The feedback for a question appears to contradict the answer.

Given a training-dev set error that is higher than the dev/test error,
I can see there is a data-mismatch problem, and that the Bayes error for the data distribution of the training-dev set must be higher.

The answer to Q7 appears to contradict this. Yet the feedback agrees with my statement above.

Hello @Evalyne_Muiruri ,
I went through the quiz of week 2 again, but I don’t understand what you mean. Can you please write me a private message with more details?

Hello, I had the same issue with Q7, and in the Coursera discussion forum another student was confused as well.
Essentially, in our answers we state that, with the figures provided in the example, the Bayes error for the training-dev distribution is higher than for the dev/test distribution. Such answers are then graded as wrong, yet the explanation contains a statement that confirms our understanding.

Hi Ercaronte,

Welcome to the community!

The question that you are asking about is a follow-up to the previous one. Please look at both questions once again and, if you still face the same issue, we will discuss it here.

Thanks!

Hello,
The answers to Question 7 seem to hinge on the “Bayes error for the dev/test distribution” versus the “train distribution”. From the lectures, I understood that the Bayes error was independent of the training or test set distributions since it was defined as the theoretically optimal error. Accordingly, I selected the only answer that didn’t involve either.
What am I missing here? I don’t understand how the Bayes error can be relative.

Hi, Marshall Mayberry.

The algorithm usually performs better on the distribution of the data it was trained on, but we cannot say that with certainty. In such cases, we need to measure the human-level error separately for each distribution. The optimal error rate is normally called the Bayes error; it guides us in judging whether the algorithm can meet the required expectation as we work through the usual ups and downs of bias/variance analysis.
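To make that bias/variance bookkeeping concrete, here is a minimal sketch with purely invented error figures, assuming (as in the lectures) that human-level error measured on each distribution is used as a proxy for that distribution's Bayes error; none of these numbers or variable names come from the assignment.

```python
# Purely hypothetical error figures (fractions), not taken from the quiz.
human_level_train = 0.02   # proxy for the Bayes error on the train / training-dev distribution
human_level_dev   = 0.01   # proxy for the Bayes error on the dev/test distribution

train_error        = 0.04  # error on the data the model was fit on
training_dev_error = 0.09  # same distribution as train, but held out from training
dev_error          = 0.06  # error on the dev set (different distribution)

avoidable_bias = train_error - human_level_train    # gap to the (estimated) Bayes error
variance       = training_dev_error - train_error   # generalisation gap within the train distribution
data_mismatch  = dev_error - training_dev_error     # negative here: dev/test looks "easier"

print(f"avoidable bias: {avoidable_bias:+.2f}")
print(f"variance:       {variance:+.2f}")
print(f"data mismatch:  {data_mismatch:+.2f}")
```

With figures like these, the negative data-mismatch gap is exactly the situation discussed in this thread: the model does better on dev/test than on held-out data from its own training distribution, which hints that the dev/test distribution is easier.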

Yes, I will recheck this query with the staff, since so many of you have pointed out issues with it.

Hi @Mubsi, could you please take another look at this query (C3 W2 quiz, Q7), as there seems to be a mismatch between the given options and the feedback provided.
If needed, I can post this query on GitHub as an issue, as you suggest. Thanks in advance.

Hi @Rashmi,

Yes, please post it as an issue, thanks.

Cheers,
Mubsi

Hello @Evalyne_Muiruri, @ercaronte and @Marshall_Mayberry.

Kindly provide us with a screenshot of this question as you received it while doing the quiz for this section. It would be a great help in resolving this query.

Thanks!

Hi, @Rashmi,
Here’s the question regarding the “Bayes error w.r.t. dev/test vs. train” distributions. I’d be interested in its resolution, just so I know whether I’m still misunderstanding something essential. To that end, I’d like to know what the correct answer is and why.

Please share a screenshot of the exact question you received, as the options may vary between versions of the quiz. Thanks.

Here’s the screenshot:

Okay, thanks! @Marshall_Mayberry.

@Rashmi It looks like there is a mistake in the recent version of this question as well: the question states “Based on the information given you conclude that the Bayes error for the dev/test distribution is higher than for the train distribution.”, but the explanation says “Since the training-dev error is higher than the dev and test errors probably the dev/test distribution is ‘easier’ than the training distribution.”

If the dev/test is easier, then I believe the Bayes error should be lower (not higher)
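To put invented numbers on it: the model was trained on neither the training-dev set nor the dev/test set, yet it scores, say, 10% on training-dev and only 6% on dev/test. That drop suggests the dev/test examples are intrinsically easier, so the best achievable (Bayes) error on the dev/test distribution should be lower than on the train distribution, not higher.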

See the screenshot attached.

Yes, OlGeorge.

You are right. The Bayes error for the dev/test distribution must be lower than for the train distribution, so the answer should be “False”.

Hello @Mubsi, could you please make the required changes? OlGeorge is right.