Assignment 2 Q7

Hi, my question is based on Q7.
Question 7
Based on the table from the previous question, a friend thinks that the training data distribution is much easier than the dev/test distribution. What do you think?

What does it mean that the ‘training data distribution is much easier than the dev/test distribution’?

UPDATED: comments on this question were removed.

Hi @shamus, please do not post direct references to (correct/incorrect) answers in the quizzes. We are here to help each other, but this is a spoiler for other students in the course.

So sorry, I forgot about that. Apologies.

Ah, but please explain what it means.

By “data distribution is much easier” or “Bayes error for the data distribution is lower”, they mean that the data distribution of a particular set is closer to the true data distribution.

This is the part that I am having trouble wrapping my head around. In the lecture it seemed to be simply “assumed” that human-level error on rear-view mirror speech data is higher than on general speech data (6% compared to 4%). Couldn’t it just as well be around 4% too, in which case the entire argument breaks down? This part is not really explained with rigor. Are we to assume that whenever dev/test error is lower than train/train-dev error, the human-level error for the dev/test distribution will always be lower as well? It doesn’t make a lot of sense to me to make that assumption.

I didn’t get the correct answer for this one myself. But I think Professor Ng did explain it, if only briefly. If a dataset contains many blurry/noisy images, the intrinsic rate of incorrect classification (Bayes error) is high. We should assume that a dataset drawn from a different distribution has a different Bayes error.
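
In case a concrete sketch helps: for a binary task, Bayes error is the average of min(p, 1 - p) over the data, where p is the true probability that an example is positive. The snippet below compares two made-up datasets that differ only in how many ambiguous (blurry) examples they contain. The function name `bayes_error` and all the numbers are my own assumptions for illustration, not anything from the lecture or the quiz.

```python
def bayes_error(groups):
    """Lowest achievable error rate for a binary task.

    groups: list of (weight, p_positive) pairs. weight is the share of
    examples in the group; p_positive is the true probability that an
    example in that group has label 1.
    """
    total = sum(weight for weight, _ in groups)
    # Even a perfect classifier is wrong min(p, 1 - p) of the time
    # on examples whose true positive probability is p.
    return sum(weight * min(p, 1 - p) for weight, p in groups) / total


# Hypothetical datasets: "sharp" examples are almost unambiguous (p = 0.99),
# "blurry" examples are ambiguous even to an ideal classifier (p = 0.80).
mostly_sharp = [(0.9, 0.99), (0.1, 0.80)]
half_blurry = [(0.5, 0.99), (0.5, 0.80)]

print(f"mostly sharp: {bayes_error(mostly_sharp):.3f}")
print(f"half blurry:  {bayes_error(half_blurry):.3f}")
```

The second dataset has the higher error floor only because it has more ambiguous examples, which is one way to read one distribution being "harder" than the other.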

Can we only compare the error with human-level error when the training dataset has the same distribution as the dev/test set?

This is my understanding: if there is a data distribution mismatch, the human-level error measured on the dev/test distribution can serve as the benchmark for dev/test, but it cannot be used as the benchmark for the training dataset.
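
A rough sketch of that bookkeeping, in case it is useful. Only the two human-level figures (4% and 6%) are the ones quoted earlier in this thread; the training, train-dev and dev errors are hypothetical numbers I picked for illustration, and `error_gaps` is just a helper name I made up.

```python
def error_gaps(human, train, train_dev, dev):
    """Split error rates (in percent) into the usual three gaps."""
    return {
        "avoidable_bias": train - human,     # vs. human level on the training distribution
        "variance": train_dev - train,       # same distribution, unseen data
        "data_mismatch": dev - train_dev,    # different distribution
    }


# Human-level error, measured separately on each distribution (from the thread).
human_train, human_dev = 4.0, 6.0
# Hypothetical model errors, for illustration only.
train, train_dev, dev = 7.0, 10.0, 6.0

gaps = error_gaps(human_train, train, train_dev, dev)
# The dev error gets compared against the human level for the dev distribution.
dev_gap_to_human = dev - human_dev

print(gaps)
print(dev_gap_to_human)
```

With these made-up numbers the data-mismatch gap is negative, so the model does better on dev than on train-dev. On its own that does not say which distribution is easier; you need the human-level estimate for each distribution to judge that.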