Bias and variance

Can someone provide some Python code that demonstrates with plots how a simple linear regression model can exhibit both high bias AND high variance as stated in Andrew’s Week 3 video lesson?

A plot of the model's predicted output, the target labels, and the input training samples would be great, so that I can visually observe both high bias and high variance.

Hi Stephen,

How about this example? It is a regression problem though.

Hi Pavel,

In his Week 3 lesson, Andrew talks about how both high bias and high variance can occur in a linear regression model, but he doesn't actually provide an example.

So it would be great if I could see what this looks like on a Python plot.

@ai_is_cool,
If you open the notebook in Colab, you should be able to see the plot.

Yes, a model can have both high bias and high variance. This happens when the model doesn’t learn the main patterns in the data (high bias) but still reacts too much to small details or noise in the training data (high variance). It means the model is both confused and too sensitive — not learning well and also overreacting.
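One way to make this visible (my own sketch, not code from the course): fit a straight line to data generated from a sine curve, using many small resampled training sets, and overlay all the fits. The average fit misses the true curve (high bias), while the individual fits scatter widely around that average (high variance). The data, seeds, and thresholds here are all invented for illustration:

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # render without a display
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
xs = np.linspace(0, 1, 100)
true_fn = np.sin(2 * np.pi * xs)  # assumed ground-truth function

fits = []
for _ in range(20):
    # one small, noisy training set drawn from the sine ground truth
    x = rng.uniform(0, 1, 8)
    y = np.sin(2 * np.pi * x) + rng.normal(0, 0.3, 8)
    w, b = np.polyfit(x, y, 1)  # straight-line (linear) fit
    fits.append(w * xs + b)
    plt.plot(xs, fits[-1], color="gray", alpha=0.3)

fits = np.array(fits)
plt.plot(xs, true_fn, "b", label="true function")
plt.plot(xs, fits.mean(axis=0), "r--", label="average fit")
plt.legend()
plt.savefig("bias_and_variance.png")

# squared bias: how far the average fit is from the truth
# variance: how much the individual fits scatter around their average
print("bias^2:", np.mean((fits.mean(axis=0) - true_fn) ** 2))
print("variance:", fits.var(axis=0).mean())
```

In the saved plot, the gap between the red dashed average and the blue curve shows the bias; the spread of the gray lines shows the variance.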

Can you show me a plot of a simple linear regression model, say for predicting house prices from floor area in square feet, so that I can better visualise high bias and high variance simultaneously?

Thanks but I’m not seeing high variance and high bias in the ground truth target values and the model’s predictions.

I don't expect a model for that simple a data set to exhibit any unusual behavior.

Can you provide a Python implementation for the simplest linear regression model that exhibits high bias and high variance simultaneously with plots that show this behaviour visually?

How can a model under-fit the training data AND over-fit at the same time? What does it look like on a plot?

I think he was speaking in general, but he did explain that it's more common for neural networks. His exact words from the video lesson were:

“You won’t see this happen that much for linear regression, but it turns out that if you’re training a neural network, there are some applications where unfortunately you have high bias and high variance.”


I have never seen a concrete example of that. I’d really like to study it, if anyone can post a dataset and model that demonstrate this.

Yes but how can a model exhibit two opposite extremes of poor performance - both high bias and high variance?

Can anyone produce a plot of what this looks like visually with target and predicted values against training and cross validation datasets?

“someone” might be able to, but if so they aren’t active on this thread.

I just can’t see how it’s possible but then maybe I just don’t know enough yet or have enough experience.

This is the example diagram Prof. Ng used in the video lesson to show some input values overfitting and some input values not fitting well at all. Although he did say this doesn’t really happen for linear models applied to 1D, and it is just to give an intuition:

He also said that an indication you have both high bias and high variance is when Jtrain is high (indicative of high bias) and Jcv >> Jtrain (indicative of high variance).
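That diagnostic is easy to write down as a tiny check. The cost values and the baseline threshold below are made up purely for illustration:

```python
# made-up cost values for illustration
J_train = 0.85        # training-set cost
J_cv = 2.40           # cross-validation cost
baseline = 0.20       # assumed baseline error (e.g. human-level performance)

# "high" is relative: here I arbitrarily use a factor of 2 as the cutoff
high_bias = J_train > 2 * baseline   # J_train high relative to baseline
high_variance = J_cv > 2 * J_train   # J_cv much higher than J_train

print(high_bias, high_variance)      # both True for these numbers
```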

Just putting this out there for anyone reading this who knows better than we do, in case this triggers a thought of a real-world neural network example that matches these criteria.


I’ve seen that also, but it falls short of being a real practical example. It’s more like a bit of artistic hand-waving.

Or, is he trying to draw a dataset that has a high density of examples in the left side of the graph, but a sparse density on the right side?

That’s a possibility.

If you include too many polynomial terms, then the left side might be prone to overfitting, and the right side prone to underfitting. Mash all that together under one cost calculation, and you’re likely to get bad cost for all the data sets (training, validation, and test).
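To make that concrete, here is one way such a dataset could be fabricated (entirely invented data, not from the course): a dense noisy cluster on the left, a handful of points on the right, and a polynomial with far too many terms fit over the whole thing.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # render without a display
import matplotlib.pyplot as plt

rng = np.random.default_rng(1)

# invented data: dense noisy cluster on the left, sparse points on the right
x = np.concatenate([rng.uniform(0.0, 0.3, 40), rng.uniform(0.3, 1.0, 5)])
y = 2.0 * x + rng.normal(0, 0.3, x.size)   # the true relation is just a line

# fit with far too many polynomial terms
degree = 9
coeffs = np.polyfit(x, y, degree)

xs = np.linspace(0, 1, 200)
plt.scatter(x, y, s=15, label="training data")
plt.plot(xs, np.polyval(coeffs, xs), "r", label=f"degree-{degree} fit")
plt.ylim(-2, 4)
plt.legend()
plt.savefig("dense_sparse_poly.png")
```

In the saved plot the curve tends to wiggle through the noise in the dense left region while behaving poorly in the sparse right region, which is one way to get bad cost everywhere at once.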

Essentially, it would mean the model is a poor choice.

I'll try to invent some data and generate a few results.


Which video lesson does this plot appear in?

I updated my example by adding polynomial features. Bias and variance estimates were obtained by performing 200 bootstrap rounds for the bias-variance decomposition.
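For anyone curious what that procedure might look like, here is a minimal sketch of a bootstrap bias-variance estimate (my own assumed setup, not the actual notebook code): each round resamples the training set with replacement, refits the model, and records its predictions; squared bias and variance are then read off from the collection of predictions.

```python
import numpy as np

rng = np.random.default_rng(0)

# assumed ground truth and training data, invented for illustration
def f(x):
    return np.sin(2 * np.pi * x)

x_train = rng.uniform(0, 1, 30)
y_train = f(x_train) + rng.normal(0, 0.3, 30)
x_test = np.linspace(0, 1, 50)

n_rounds = 200
preds = np.empty((n_rounds, x_test.size))
for i in range(n_rounds):
    # one bootstrap round: resample the training set with replacement
    idx = rng.integers(0, x_train.size, x_train.size)
    w, b = np.polyfit(x_train[idx], y_train[idx], 1)  # straight-line model
    preds[i] = w * x_test + b

mean_pred = preds.mean(axis=0)
bias_sq = np.mean((mean_pred - f(x_test)) ** 2)  # squared bias vs. truth
variance = preds.var(axis=0).mean()              # spread across rounds
print(f"bias^2 ~ {bias_sq:.3f}, variance ~ {variance:.3f}")
```

The "bootstrap rounds" are the 200 resample-and-refit iterations; the decomposition is the split of the model's error into the squared-bias and variance terms computed at the end.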

What is a bootstrap round?

What is bias-variance decomposition?

I haven’t come across these terms from Andrew’s course so far.