Hello, and thank you for being here. In one of the lectures, the professor mentioned that we need a standard for determining whether our model has high bias or high variance, and he called this standard the "Bayes error". My question is: how can we determine the value of this error?
Thanks in advance.
It is an important question, but unfortunately it does not have an easy answer. The Bayes Error is defined as the lowest error that any model could possibly achieve on the prediction task. The problem is that there is no theoretical way to determine it a priori; you can only reason about it from experimental results. What we can say is that, by definition:
Bayes Error <= Human Error
So the normal approach is to use the measured Human Error on the prediction task as a “proxy” for the Bayes Error. At least that gives you an Upper Bound for it. Of course there are further subtleties in defining Human Error as well. You can start by measuring the prediction error that human subjects make on your sample data, first using people from the general population. Then you can also try with someone who is a trained expert in the field. E.g. if you are analyzing medical image data to detect some disease or condition, the expert would be a radiologist. Then you can even try the extreme case of using a group or committee of trained experts and getting their predictions. Of course what you find there is:
General Human Error >= Trained Expert Error >= Group of Trained Experts Error
So the last value (if you can afford to run that test) would be the best estimate for the Bayes Error, since it is the tightest upper bound you can actually measure.
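To make that concrete, here is a minimal Python sketch of the idea. The labels, the human answers, and the names `error_rate` and `bayes_error_proxy` are all made up for illustration; they do not come from the course. It just measures the error of each group against ground truth and takes the lowest one as the proxy.

```python
def error_rate(predictions, labels):
    """Fraction of predictions that disagree with the ground-truth labels."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    wrong = sum(p != y for p, y in zip(predictions, labels))
    return wrong / len(labels)


def bayes_error_proxy(human_errors):
    """Lowest measured human error, used as an upper bound on Bayes Error."""
    return min(human_errors.values())


# Hypothetical ground-truth labels and human answers (illustration only).
labels = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0]
human_predictions = {
    "general_population": [1, 1, 0, 1, 0, 1, 1, 0, 0, 0],
    "trained_expert": [1, 0, 1, 1, 0, 1, 1, 0, 0, 0],
    "expert_committee": [1, 0, 1, 1, 0, 1, 1, 0, 1, 0],
}

# Measure each group's error against the same labels.
human_errors = {
    group: error_rate(preds, labels)
    for group, preds in human_predictions.items()
}

proxy = bayes_error_proxy(human_errors)
print(human_errors)
print("Proxy (upper bound) for Bayes Error:", proxy)
```

With real data you would need far more than ten examples for the measured rates to be trustworthy, and the result is still only an upper bound, not the true Bayes Error.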
Prof Ng will discuss this quite a bit more in Course 3, so maybe the best idea is to “hold that thought” and listen to what he says in Course 3.