Prof. Ng says it is possible for a model to have high bias and high variance. How is this possible? Aren’t they both opposite extremes of model performance?
I agree with you; in my view it’s not possible to have both at the same time.
I’m not sure why Andrew says that.
So what conclusions can be drawn about the performance of a model when the training error is higher than the baseline performance and the cross validation error is higher than the training error?
hi @ai_is_cool
That statement about high bias and high variance is based on the training set error rate and the dev (validation) set error rate.
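To make that diagnostic concrete, here is a hypothetical set of error rates (these numbers are invented for illustration, not taken from the course):

- baseline (e.g. human-level) error: 10%
- training error: 16% (a large gap above the baseline, which indicates high bias)
- cross-validation error: 24% (a large gap above the training error, which indicates high variance)

When both gaps are large at the same time, the lectures describe the model as having both high bias and high variance.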
Here is a discussion which explains a little more detail using the course material as well as other data
In some cases, a model is built and tuned using a training set and a validation set, performs very well there (overfitting), and then performs poorly when evaluated on the held-out test set, indicating that the model is biased toward the data it was trained on.
Regards
DP
Possibilities:
- The total amount of data is not sufficient to create a useful model.
- The statistics of the various splits (training, validation, and test) are not sufficiently similar (see the sketch after this list).
- Bad luck in the random selection of the data splits.
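On the second point, one quick sanity check is to compare summary statistics across the splits. This is just a minimal sketch with invented data; the sizes and the feature matrix are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(42)
X = rng.normal(loc=3.0, scale=1.5, size=(1000, 4))  # fake feature matrix

# Random 60/20/20 split into train / validation / test
idx = rng.permutation(len(X))
train, val, test = np.split(X[idx], [600, 800])

for name, split in [("train", train), ("val", val), ("test", test)]:
    print(f"{name:5s} mean={split.mean(axis=0).round(2)} "
          f"std={split.std(axis=0).round(2)}")
# If the per-feature means or standard deviations differ a lot between
# the rows printed here, the splits are not statistically similar and
# the error rates measured on them will be unreliable.
```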
@TMosh So when Andrew says it indicates high bias AND high variance, is he wrong?
Is this something that needs correction @Wendy?
I don’t have any ready examples of simultaneous high variance and high bias.
But Andrew has consistently made this statement in all of his “ML Introduction” courses, so he must have some worst-case scenario in mind. Certainly he’s vastly more experienced than I am.
I just don’t know what that scenario might be.
I’m confused. What statement are you referring to that Andrew has consistently made throughout all his ML Introduction courses?
I don’t see how a model could perform poorly with both high bias and high variance, simultaneously under-fitting and over-fitting an input training dataset.
This statement of Andrew’s:
Background: I’ve been mentoring Andrew’s ML intro courses for 10 years now.
Every one of them contains that same statement, with no examples given.
Yes, this is what he has stated, but how is that possible when high bias and high variance are opposite extremes of a poorly performing model?
So how can it be explained that a model can have simultaneously high bias and high variance?
I’ve already said I do not have an explanation for that statement. I apologize for my lack of understanding of Andrew’s intentions.
I agree with @TMosh that Prof. Ng has very intentionally made the point that, while it is rare, there are some neural networks where you could see both high bias and high variance. And it sounds like he has seen these situations and wants students to be aware it is possible.
The example he gives in the video to give us an intuition about how this could be is that the model might overfit in some places and underfit in others.
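One way to see that intuition in a toy setting is to fit a single polynomial to data that has a sharp step in one region and plain noise elsewhere. This is just a sketch with invented data, not the course’s example; the polynomial is too rigid to track the step (underfitting there) yet flexible enough to chase noise between training points (overfitting there):

```python
import numpy as np

rng = np.random.default_rng(0)

def target(x):
    # A sharp step at x = 0: a smooth polynomial cannot track this well
    return np.where(x < 0, -1.0, 1.0)

x_train = np.sort(rng.uniform(-1, 1, 30))
y_train = target(x_train) + rng.normal(0, 0.3, x_train.size)
x_cv = np.sort(rng.uniform(-1, 1, 200))
y_cv = target(x_cv) + rng.normal(0, 0.3, x_cv.size)

# Degree 9: too rigid to capture the step cleanly, yet flexible
# enough to wiggle through the noise between training points.
coeffs = np.polyfit(x_train, y_train, 9)

mse_train = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
mse_cv = np.mean((np.polyval(coeffs, x_cv) - y_cv) ** 2)
print(f"noise floor: {0.3**2:.3f}, train MSE: {mse_train:.3f}, cv MSE: {mse_cv:.3f}")
# Typically the train MSE sits noticeably above the noise floor (the
# step resists a smooth fit: bias) while the cv MSE is higher still
# (the wiggles hurt between training points: variance). Exact numbers
# vary with the random seed.
```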
I’m confused.
High bias is where the b term dominates and high variance is where the weight terms and input features dominate in a linear regression problem - correct?
No, I don’t think the terms “bias” and “variance” in this context map directly to the weight and bias values. They’re more descriptive terms, perhaps rooted in statistics, rather than directly pointing at the learned parameters.
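For what it’s worth, the statistical meaning being pointed at here is the standard bias-variance decomposition of expected squared error, for a learned predictor $\hat{f}$, true function $f$, and noise variance $\sigma^2$:

$$\mathbb{E}\big[(\hat{f}(x) - y)^2\big] = \underbrace{\big(\mathbb{E}[\hat{f}(x)] - f(x)\big)^2}_{\text{bias}^2} + \underbrace{\mathbb{E}\Big[\big(\hat{f}(x) - \mathbb{E}[\hat{f}(x)]\big)^2\Big]}_{\text{variance}} + \sigma^2$$

The expectation is taken over training sets: bias measures the systematic error of the average fitted model, and variance measures how much the fit moves when the training data changes. Neither term refers to the learned parameters $w$ and $b$, and nothing in the decomposition forces one term to be small when the other is large.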
He was referring to linear regression, not neural networks, in the video lesson.
A classic example of high bias is using linear regression when the underlying relationship is non-linear, or when important features are omitted. For instance, in house price prediction, suppose the model relates price only to the number of bedrooms, even though price also depends on square footage, the presence of a garage or basement, ventilation, the number of windows, and so on.
Such a model performs poorly even on the training and validation data, because the omitted features have a significant effect on price. That systematic underfitting is high bias.
Conversely, a model that fits its training data very closely but fails on housing data from a different location (i.e., unseen data) is exhibiting high variance.
High bias and high variance together show up most often when a model is evaluated on data whose distribution differs from the training data. Medical research is a common example: a disease prediction model may be built on validation data that captures one specific cause and effect relationship while ignoring confounding factors, so it fits its own data very well yet fails on new patients suspected of having the disease.
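To illustrate the omitted-feature point with a toy sketch (synthetic data and invented coefficients, not the course’s dataset):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(7)
bedrooms = rng.integers(1, 6, 500).astype(float)
sqft = rng.uniform(600, 3000, 500)
price = 20 * bedrooms + 0.15 * sqft + rng.normal(0, 10, 500)

X_partial = bedrooms.reshape(-1, 1)         # bedrooms only
X_full = np.column_stack([bedrooms, sqft])  # bedrooms + square footage

for name, X in [("bedrooms only", X_partial), ("bedrooms + sqft", X_full)]:
    model = LinearRegression().fit(X, price)
    mse = mean_squared_error(price, model.predict(X))
    print(f"{name}: train MSE = {mse:.1f}")
# Expect a much larger training MSE for "bedrooms only": the omitted
# feature (sqft) carries most of the signal, so the model underfits
# even its own training data, i.e. high bias.
```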
Also, on your point that Andrew Ng was discussing linear regression rather than neural networks: he applies the same error analysis to neural networks as well. A neural network also learns a relationship, linear or non-linear, between inputs and outputs, so when he discusses these errors in the context of regression, the same reasoning carries over to the input-to-output mapping a neural network learns from the data distribution.
Thanks but I’m not really understanding your post.
Can anyone provide me with an example of linear regression in the context of Andrew’s video lesson in Week 3 that demonstrates high bias AND high variance?
The housing price example was from Andrew Ng’s video explanation.
In predicting housing prices, a simple linear regression model may underfit (high bias) because housing prices are influenced by multiple factors in a non-linear way. However, a very complex model, such as a deep neural network, may overfit (high variance) if the training data is not sufficiently large or diverse.
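As a rough sketch of that contrast on synthetic one-feature “housing” data (the degrees, coefficients, and noise level are arbitrary choices for illustration):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

rng = np.random.default_rng(1)
size = rng.uniform(500, 3500, 60).reshape(-1, 1)             # sq ft
price = 200 * np.log(size.ravel()) + rng.normal(0, 20, 60)   # non-linear truth

X_tr, X_cv, y_tr, y_cv = train_test_split(size, price, random_state=0)

for label, degree in [("degree 1  (underfit / high bias)", 1),
                      ("degree 15 (overfit / high variance)", 15)]:
    model = make_pipeline(StandardScaler(), PolynomialFeatures(degree),
                          LinearRegression())
    model.fit(X_tr, y_tr)
    print(label,
          "| train MSE:", round(mean_squared_error(y_tr, model.predict(X_tr)), 1),
          "| cv MSE:", round(mean_squared_error(y_cv, model.predict(X_cv)), 1))
# Typically the degree-1 model shows high error on BOTH sets (bias),
# while the degree-15 model shows low training error but much higher
# cross-validation error (variance). Exact numbers vary with the seed.
```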