Bias and Variance

Alexander_Leon · July 29, 2022, 4:39am

On bias, the lecture mentioned that you can measure bias by comparing to human-level performance, a competing algorithm's performance, or a guess based on experience. How would I apply this advice to a Kaggle competition? I have a machine learning model I created that I want to test for bias. How would I accomplish this?

Elemento · July 29, 2022, 5:02am

Hey @Alexander_Leon,
In a Kaggle competition you already have a dataset, so you can use the strategy discussed in the lecture video “Model selection and training/cross validation/test sets”: split your dataset into training, cross-validation and test sets, then compare the performance on the training and cross-validation sets to find out whether your model has high bias, high variance or both, as discussed in the lecture video “Diagnosing bias and variance”. Let me know if this helps.
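For what it's worth, here is a minimal sketch of that split-and-compare step in plain Python. The 60/20/20 split, the toy data and the threshold rule standing in for a trained model are all assumptions made for illustration; they are not from the lecture or from any particular competition.

```python
import random

def split_dataset(rows, train_frac=0.6, cv_frac=0.2, seed=0):
    """Shuffle rows and split them into training, cross-validation and test sets.

    The 60/20/20 default is an assumed ratio, not a prescribed one.
    """
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    n_train = int(len(rows) * train_frac)
    n_cv = int(len(rows) * cv_frac)
    train = rows[:n_train]
    cv = rows[n_train:n_train + n_cv]
    test = rows[n_train + n_cv:]
    return train, cv, test

def error_rate(predict, rows):
    """Fraction of (feature, label) rows that predict() gets wrong."""
    wrong = sum(1 for x, y in rows if predict(x) != y)
    return wrong / len(rows)

# Toy data (made up): the label is 1 when the single feature exceeds 0.5.
rng = random.Random(1)
data = [(x, int(x > 0.5)) for x in (rng.random() for _ in range(1000))]
train, cv, test = split_dataset(data)

# Hypothetical stand-in for a trained model: a slightly wrong threshold rule.
def predict(x):
    return int(x > 0.4)

train_error = error_rate(predict, train)
cv_error = error_rate(predict, cv)
print(f"train error: {train_error:.3f}, cv error: {cv_error:.3f}")
```

The test set is left untouched here; it is only meant for a final check once the model has been chosen.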

Cheers,
Elemento

Alexander_Leon · July 29, 2022, 5:53am

@Elemento I’m thinking about the “Establishing a baseline level of performance” video. Using that video as reference, if I split my Kaggle dataset into three and computed that the training error was 10.8% and my cross-validation error was 14.8%, then does this demonstrate high bias or high variance?

Elemento · July 29, 2022, 6:05am

Hey @Alexander_Leon,
My bad. So, essentially your question is “How do you determine bias when there is no human-level (or baseline) performance available?”. There have been many discussions on this in the past; let me link a few of those:

You will find that these posts share a great deal of knowledge regarding your query. Do check these out.

Now, when it comes to a Kaggle competition, there is a simple hack: check the top scorer on the leaderboard. Let’s say they achieve a 1% error. You can take this as the baseline performance (since you want to beat the top scorer). Against that baseline, your training error of 10.8% is 9.8 percentage points worse, which indicates high bias, and the 4-point gap between your training error (10.8%) and your cross-validation error (14.8%) indicates considerable variance as well. And off you go! Let me know if this helps.
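As a rough sketch, the comparison above can be written out in a few lines of Python. The function name `diagnose` and the 2-percentage-point cutoff for calling a gap "high" are assumptions for illustration only; the 1% baseline is the hypothetical leaderboard score from this post, and the other two numbers are the ones from your question.

```python
def diagnose(baseline_error, train_error, cv_error, threshold=0.02):
    """Compare error gaps against a cutoff.

    threshold is an assumed cutoff (2 percentage points), not a standard value.
    """
    bias_gap = train_error - baseline_error   # how far training error is from the baseline
    variance_gap = cv_error - train_error     # how much worse cross-validation is than training
    return {
        "bias_gap": bias_gap,
        "variance_gap": variance_gap,
        "high_bias": bias_gap > threshold,
        "high_variance": variance_gap > threshold,
    }

result = diagnose(baseline_error=0.01, train_error=0.108, cv_error=0.148)
print(f"bias gap: {result['bias_gap']:.1%}, high bias: {result['high_bias']}")
print(f"variance gap: {result['variance_gap']:.1%}, high variance: {result['high_variance']}")
# bias gap: 9.8%, high bias: True
# variance gap: 4.0%, high variance: True
```

How large a gap counts as "high" is a judgment call that depends on the problem, so treat the cutoff as something to tune rather than a rule.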

Cheers,
Elemento
