Question about lecture - Error analysis

Hello,

I have a question about the lecture in week 3 about Error analysis under “Machine learning development process”. Around the 5:13 mark, he says you can use more data and add more features to combat the misclassification of emails. Why is this the case?

I thought that if the model misclassified an email, it meant the model could not grasp the complexity of the problem, so it was a high-bias problem. And earlier he said a high-bias problem is typically not solved by adding more data. I do understand why you would add more features.

Thanks,
Flim

Adding more data means the fit has to adapt to those examples as well, so maybe the model no longer overfits!


Hi @Flimdejong,

This lecture on Error Analysis shows a different way to diagnose a learning algorithm's performance problem. Once you can see which categories of data are misclassified most often, getting more data or engineering more features is a way to help the algorithm learn and improve its accuracy.
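To make the idea of counting misclassifications by category concrete, here is a minimal Python sketch. The tag names and the hand-labelled examples are hypothetical and made up for illustration; in practice you would tag a sample of the misclassified cross-validation emails by hand.

```python
from collections import Counter

# Hypothetical hand-assigned tags for misclassified cross-validation emails.
# One inner list per email; an email can carry more than one tag.
misclassified_tags = [
    ["pharma"],
    ["phishing"],
    ["pharma", "misspelling"],
    ["phishing"],
    ["pharma"],
    ["unusual_routing"],
]


def tally_categories(tagged_examples):
    """Count how many misclassified examples fall into each category."""
    counts = Counter()
    for tags in tagged_examples:
        # set() ensures a tag repeated on one email is counted only once.
        counts.update(set(tags))
    return counts


counts = tally_categories(misclassified_tags)

# Print categories from most to least frequent.
for category, n in counts.most_common():
    print(f"{category}: {n} of {len(misclassified_tags)}")
```

The categories at the top of the list are the ones where extra data or new features are most likely to pay off.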


How did you conclude that it overfits?

That's not a conclusive analysis.

Yes, I can understand that. But that conclusion is what I took from the lectures.

You can just as easily get incorrect predictions from overfitting (high variance) or underfitting (high bias).

Hello @Flimdejong,

To begin with, I think you are right that adding more data wouldn’t help a high-bias problem, and that’s exactly why, if someone says adding data helped, we should doubt whether the misclassified emails were due to high bias.

If you watch the video again at 6:34, 7:00, and 8:00, Andrew repeatedly mentions variance too (especially at 8:00). I recommend going through that video again, and this one too. Sometimes reviewing a video a second time on another day can give us a different view :wink: For example, how would we diagnose a bias problem and a variance problem? This is a must-know.
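As a rough illustration of that diagnosis, here is a minimal Python sketch that compares a baseline error, the training error, and the cross-validation error. The function name `diagnose`, the `tolerance` threshold, and the error values are all assumptions for illustration, not something taken from the lecture.

```python
def diagnose(baseline_error, train_error, cv_error, tolerance=0.02):
    """Label a model as high bias and/or high variance from three error rates.

    tolerance is a hypothetical threshold for what counts as a "large" gap;
    a sensible value depends on the problem.
    """
    # Gap between training error and the baseline (e.g. human-level) error.
    bias_gap = train_error - baseline_error
    # Gap between cross-validation error and training error.
    variance_gap = cv_error - train_error

    findings = []
    if bias_gap > tolerance:
        findings.append("high bias")
    if variance_gap > tolerance:
        findings.append("high variance")
    return findings or ["neither"]


# Made-up error rates for three scenarios.
print(diagnose(0.10, 0.11, 0.20))  # training is fine, cross-validation is much worse
print(diagnose(0.10, 0.20, 0.21))  # training is already far from the baseline
print(diagnose(0.10, 0.20, 0.30))  # both gaps are large
```

Only when the second gap is large would more data be expected to help, which is why misclassified emails alone do not tell us which problem we have.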

Cheers,
Raymond


Ah, I totally missed that. I will go through the video again. Thank you for your response!