This is about the image above from the lecture. High bias is when the model does not perform well on the train set, and high variance is when the model performs well on the train set but much worse on the dev set.
Given:
- Train set error: 15%
- Dev set error: 30%
Why do we conclude that this model has high bias AND high variance? I would think that this is just high bias, because it fails on the train set and fails even more on the dev set.
Is it high bias because it fails on the train set and then high variance because it fails even more on the dev set?
Do we conclude that a model has high bias when it fails on the train set, and high variance when it does even worse on the dev set? And since this model does both, we say that it has high bias and high variance?
I think I figured it out, but I will leave this thread open until someone verifies this for me.
So, we check the train set error first. If it is large (15%), then we have high bias. Then we look at how much larger the dev set error is compared to the train set error:
- If the difference is small (15% train / 16% dev), we conclude that we only have high bias.
- If the difference is large (15% train / 30% dev), we conclude that the model should go in the trash, because we have high bias (15% train) AND high variance (performance on the dev set is much worse than on the train set).
- If the train set error is small and the difference is large (1% train / 15% dev), we conclude that we only have high variance: the gap between train and dev is still large, but the train set error is too small to call it high bias.
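To double-check the rule, here is a minimal Python sketch of it. The function name `diagnose` and the two 5% thresholds are my own assumptions for illustration, not something from the lecture, and the rule only makes sense if an error close to 0% is actually achievable on this task.

```python
def diagnose(train_err, dev_err, bias_threshold=0.05, gap_threshold=0.05):
    """Label a model from its train and dev error rates (as fractions).

    Both thresholds are hypothetical values picked for illustration;
    what counts as "large" depends on the task.
    """
    # High bias: the model already does poorly on the data it was trained on.
    high_bias = train_err > bias_threshold
    # High variance: the dev error is much larger than the train error.
    high_variance = (dev_err - train_err) > gap_threshold

    if high_bias and high_variance:
        return "high bias and high variance"
    if high_bias:
        return "high bias"
    if high_variance:
        return "high variance"
    return "low bias and low variance"


# The three cases from this post, plus a model that does well on both sets.
for train_err, dev_err in [(0.15, 0.30), (0.15, 0.16), (0.01, 0.15), (0.005, 0.01)]:
    print(f"train {train_err:.1%} / dev {dev_err:.1%}: {diagnose(train_err, dev_err)}")
```

With these assumed thresholds, 15% / 30% comes out as both, 15% / 16% as high bias only, and 1% / 15% as high variance only, which matches the cases above.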
This is the general idea right?