Network size and bias variance tradeoff

Sara · April 25, 2021, 6:48am

I got some mixed messages from week 2’s videos:

In Basic Recipe for Machine Learning, the recipe for improving a deep learning model with low bias and high variance includes trying out regularization. And Andrew also mentioned that unlike traditional machine learning methods, larger networks in deep learning don’t face the bias-variance tradeoff.
==> message 1: a large network reduces bias but doesn’t increase variance

However, in a later video explaining how regularization helps prevent overfitting, I think it said that regularization makes w small and reduces the effects of neurons, which acts like training a simpler network.
==> message 2: a simple (small) network reduces variance, and a complex (large) network increases variance

I imagine the answer has something to do with the fact that we aren’t actually training a smaller network, or that simple doesn’t necessarily mean small. But I need help clearing up this confusion.

Another question: how does increasing the size of the network reduce bias without increasing variance? Wouldn’t more neurons and layers give the network more freedom to make individual decisions for the samples it was trained on?

albertovilla · April 25, 2021, 8:01am

Hi @Sara,

I’m not sure about the specific context in which Andrew made the first statement. I haven’t checked the content of Course 2 in some time, but I don’t think conclusion 1 is what he meant. In fact, his book “Machine Learning Yearning” says:

“Increasing the model size generally reduces bias, but it might also increase variance and the risk of overfitting. However, this overfitting problem usually arises only when you are not using regularization. If you include a well-designed regularization method, then you can usually safely increase the size of the model without increasing overfitting.”
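To make the effect of regularization on the weights concrete, here is a minimal sketch (not from the course or the book). It assumes a plain linear model trained by gradient descent as a stand-in for a network, with a hypothetical L2 penalty strength `lam`. As `lam` grows, the learned weights shrink and the training error rises, which is the sense in which a regularized large model behaves like a simpler one.

```python
import random


def fit_linear_l2(X, y, lam, lr=0.05, steps=2000):
    """Gradient descent on (1/2n) * sum(residual^2) + (lam/2) * ||w||^2."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(steps):
        # Residuals are computed once per step, before any weight changes.
        residuals = [sum(wj * xj for wj, xj in zip(w, row)) - t
                     for row, t in zip(X, y)]
        for j in range(d):
            grad = sum(r * row[j] for r, row in zip(residuals, X)) / n
            grad += lam * w[j]  # the L2 term pulls each weight toward zero
            w[j] -= lr * grad
    return w


def l2_norm(w):
    return sum(v * v for v in w) ** 0.5


def train_mse(X, y, w):
    preds = [sum(wj * xj for wj, xj in zip(w, row)) for row in X]
    return sum((p - t) ** 2 for p, t in zip(preds, y)) / len(y)


# Synthetic data (an assumption for illustration): only the first two
# of five features actually influence the target.
random.seed(0)
X = [[random.uniform(-1, 1) for _ in range(5)] for _ in range(30)]
y = [2.0 * row[0] - 1.0 * row[1] + random.gauss(0, 0.1) for row in X]

results = []
for lam in (0.0, 0.1, 1.0):
    w = fit_linear_l2(X, y, lam)
    results.append((lam, l2_norm(w), train_mse(X, y, w)))
    print(f"lam={lam}: weight norm={results[-1][1]:.3f}, "
          f"train MSE={results[-1][2]:.4f}")
```

The model keeps the same number of parameters in every run; only the penalty changes, so "simpler" here means smaller weights rather than fewer units.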

In the book Andrew explains in more detail techniques to reduce bias and variance. I don’t think I’m allowed to post it here although he made available a draft version of the book online. In any case, the summary is as follows:

Techniques for reducing avoidable bias

Increase the model size
Modify input features based on insights from error analysis
Reduce or eliminate regularization
Modify model architecture

Techniques for reducing variance

Add more training data
Add regularization
Add early stopping
Feature selection to decrease number / type of input features
Decrease the model size
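Plus techniques #2 and #4 from the bias reduction list (modifying input features based on insights from error analysis, and modifying the model architecture)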
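To decide which of the two lists above to reach for, the book splits the error into avoidable bias (training error minus the optimal error rate) and variance (dev error minus training error). A minimal sketch of that bookkeeping follows; the function name `diagnose`, the `focus` field, and the error rates are hypothetical, chosen only for illustration.

```python
def diagnose(train_error, dev_error, optimal_error=0.0):
    """Split the error into avoidable bias and variance.

    avoidable bias = training error - optimal (unavoidable) error
    variance       = dev error - training error
    """
    avoidable_bias = train_error - optimal_error
    variance = dev_error - train_error
    # Point at the list of techniques for the larger component
    # (ties go to "variance" in this sketch).
    focus = "bias" if avoidable_bias > variance else "variance"
    return {
        "avoidable_bias": avoidable_bias,
        "variance": variance,
        "focus": focus,
    }


# Hypothetical error rates, not measurements from any real model.
overfit = diagnose(train_error=0.01, dev_error=0.11)
underfit = diagnose(train_error=0.15, dev_error=0.16)
print(overfit["focus"], underfit["focus"])
```

With these made-up numbers, the first model would call for the variance list (for example, adding regularization) and the second for the bias list (for example, increasing the model size).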

I hope this helps.

Sara · April 25, 2021, 2:04pm

Hi @albertovilla,
Yes, this makes a lot more sense. Thanks! But I have a new question regarding your post. Can you please explain how decreasing the number of input features would help reduce variance?

albertovilla · April 26, 2021, 5:58pm

Hi @Sara, Andrew’s explanation is surely better than mine:

Feature selection to decrease number/type of input features: This technique
might help with variance problems, but it might also increase bias. Reducing the number
of features slightly (say going from 1,000 features to 900) is unlikely to have a huge effect
on bias.

Reducing it significantly (say going from 1,000 features to 100—a 10x reduction)
is more likely to have a significant effect, so long as you are not excluding too many useful
features.

In modern deep learning, when data is plentiful, there has been a shift away from
feature selection, and we are now more likely to give all the features we have to the
algorithm and let the algorithm sort out which ones to use based on the data. But when
your training set is small, feature selection can be very useful.
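As an illustration of what feature selection can look like in practice, here is a minimal sketch. It assumes one simple filter method, ranking features by absolute Pearson correlation with the target, which is my choice for the example and not something the book prescribes. The helper names and the synthetic data are hypothetical.

```python
import random


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sx = sum((a - mx) ** 2 for a in xs) ** 0.5
    sy = sum((b - my) ** 2 for b in ys) ** 0.5
    return cov / (sx * sy)


def select_top_k(X, y, k):
    """Return the sorted column indices of the k features most correlated with y."""
    d = len(X[0])
    scores = [abs(pearson([row[j] for row in X], y)) for j in range(d)]
    ranked = sorted(range(d), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])


# Synthetic data (an assumption for illustration): 10 features, of which
# only features 0 and 3 actually drive the target.
random.seed(0)
X = [[random.gauss(0, 1) for _ in range(10)] for _ in range(500)]
y = [3.0 * row[0] - 2.0 * row[3] + random.gauss(0, 0.5) for row in X]

kept = select_top_k(X, y, k=2)
X_small = [[row[j] for j in kept] for row in X]  # reduced input for the model
print(kept)
```

A model trained on `X_small` has fewer inputs that it could fit noise to, which is how this can reduce variance. The risk is that a filter like this drops a useful feature, which would raise bias.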
