Bias Variance tradeoff

ajaykumar3456 · May 7, 2021, 12:09pm

In the deep learning era, when applying regularisation to address high variance, will bias be affected?
And when increasing the complexity of the neural network to address high bias, does that lead to overfitting?

Or are bias and variance independent of each other, i.e. reducing bias will not increase variance and reducing variance will not increase bias?
Could you please explain?

javier · May 7, 2021, 7:50pm

Hi @ajaykumar3456 , welcome to the DLS !!

This is an interesting question, but the answer will depend on your current model structure and the changes you apply to it.

For example, if you have a High Bias problem, you may increase the size of your NN, but this change may be relatively small and not enough to increase the network overfitting significantly.

In the same way, applying techniques to reduce Variance, like regularization, dropout, etc. might not increase the Bias significantly if the model was not overfitting.

Also, some recent papers argue that you can reduce both Bias and Variance by increasing the width of a NN, for example: A Modern Take on the Bias-Variance Tradeoff in Neural Networks

In conclusion, you’ll have to measure your model’s Bias and Variance after each change to really see how they are affected. To that end, you can follow Andrew’s recipe: Basic Recipe for Machine Learning

ajaykumar3456 · May 8, 2021, 6:30am

First of all, thanks for your explanation @javier.
Does increasing the width mean adding more hidden layers?
Also, you wrote:
“In the same way, applying techniques to reduce Variance, like regularization, dropout, etc. might not increase the Bias significantly if the model was not overfitting.”
From your explanation, we apply regularisation or dropout only when the model is overfitting, in order to reduce the overfitting. But in that last line you said bias will not increase if the model was not overfitting.
Could you please explain?

manifest · May 8, 2021, 9:12am

Hey @ajaykumar3456,

As @javier pointed out, the exact actions depend highly on the model we are working on. The process is iterative, meaning that:

We make a decision on what actions to apply to reduce bias or variance in the model.
We estimate errors on a train set and a dev set (it is sometimes called a validation set).
If we are not satisfied with the results, we decide again which actions to apply.

To understand what problem we are facing, we estimate an error on a train set and an error on a dev set and then compare these errors:

If training set error is high, we have a high bias problem.
If training set error is low, but dev set error is high, we have a high variance problem.
If training set error is high and dev set error is noticeably higher still, we have both high bias and high variance.
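The comparison above can be sketched in a few lines of Python. This is only an illustration: the function name `diagnose`, the `target_error` baseline, and the `tolerance` of 0.02 are assumptions, not values from the course, and high variance is judged here by the gap between dev and train error.

```python
def diagnose(train_error, dev_error, target_error=0.0, tolerance=0.02):
    """Classify the problem from train and dev errors (fractions in [0, 1]).

    target_error and tolerance are hypothetical knobs chosen for illustration.
    """
    # High bias: the train error is well above the error we are aiming for.
    high_bias = (train_error - target_error) > tolerance
    # High variance: the dev error is well above the train error.
    high_variance = (dev_error - train_error) > tolerance

    if high_bias and high_variance:
        return "high bias and high variance"
    if high_bias:
        return "high bias"
    if high_variance:
        return "high variance"
    return "ok"


print(diagnose(0.15, 0.16))  # high bias
print(diagnose(0.01, 0.11))  # high variance
print(diagnose(0.15, 0.30))  # high bias and high variance
```

What counts as a "high" error depends on the task, so `target_error` would normally be set to something like human-level error rather than zero.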

The rule of thumb is to achieve a low error on the train set first, which means we address a high bias problem first. To do that we can:

Increase model capacity (e.g. increase the number of hidden units and/or layers).
Increase mini-batch size.
Use additional features.
Just train our model for a longer time.

Only once we have a low error on the training set do we start working on reducing the error on the dev set, which means we start addressing a high variance problem. For that purpose we can:

Collect more data.
Apply regularization techniques.
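As a rough sketch of that ordering (bias first, variance second), the two lists can be turned into a small helper. The names `next_actions`, `target_error` and `tolerance` are hypothetical, and the threshold is an arbitrary choice for illustration.

```python
# Options taken from the two lists in this post.
BIAS_ACTIONS = [
    "increase model capacity (more hidden units and/or layers)",
    "increase mini-batch size",
    "use additional features",
    "train for longer",
]
VARIANCE_ACTIONS = [
    "collect more data",
    "apply regularization techniques",
]


def next_actions(train_error, dev_error, target_error=0.0, tolerance=0.02):
    """Suggest what to try next, addressing high bias before high variance."""
    # Step 1: if the train error is still high, work on bias first.
    if train_error - target_error > tolerance:
        return BIAS_ACTIONS
    # Step 2: train error is low, so look at the train/dev gap.
    if dev_error - train_error > tolerance:
        return VARIANCE_ACTIONS
    # Both errors are acceptable under the chosen tolerance.
    return []


print(next_actions(0.15, 0.30))  # bias actions, even though the gap is also large
print(next_actions(0.01, 0.11))  # variance actions
```

After each change the errors are measured again and the helper is called with the new values, which matches the iterative process described earlier.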

ajaykumar3456 · May 8, 2021, 10:14am

Thanks @manifest for your explanation. I understand the concept now.

javier · May 9, 2021, 2:31am

Hi @ajaykumar3456, there is not much that I can add to @manifest’s great explanation.
But just to clarify, the point I tried to make is that applying a technique to reduce Bias will not always increase your Variance significantly, and applying a technique to reduce Variance will not always increase your Bias significantly.
So you should always measure Bias and Variance before and after making changes to the model.
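One way to make that before/after check concrete is to track two proxies: the train error (for Bias) and the gap between dev and train error (for Variance). The sketch below assumes those proxies; the function name and the example numbers are made up for illustration.

```python
def bias_variance_change(before, after):
    """Compare two (train_error, dev_error) pairs measured before and after a change.

    Returns the change in train error (bias proxy) and the change in the
    dev-minus-train gap (variance proxy). Negative values mean improvement.
    """
    train_before, dev_before = before
    train_after, dev_after = after
    gap_before = dev_before - train_before
    gap_after = dev_after - train_after
    return {
        "bias_change": train_after - train_before,
        "variance_change": gap_after - gap_before,
    }


# Hypothetical numbers: a bigger network cuts the train error a lot
# while the train/dev gap grows only slightly.
print(bias_variance_change(before=(0.15, 0.17), after=(0.05, 0.08)))
```

In this made-up example the bias proxy drops by about 0.10 while the variance proxy rises by about 0.01, which is the kind of outcome described above: less Bias without a significant increase in Variance.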

ajaykumar3456 · May 9, 2021, 5:35am

Thank you @javier. You made it clear.

Sara · May 23, 2021, 8:40am

Hi @manifest
In the course I think we only learned how the mini-batch size increases training speed. Could you please explain how the mini-batch size affects bias as well?

manifest · May 23, 2021, 9:13am

Sure. Small batches provide some regularizing effect, perhaps due to the noise they add to the learning process. If we have high bias we may want to decrease the regularizing effect, and increasing the batch size is one way to do that.
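To see the noise itself, here is a minimal sketch that measures how much mini-batch gradient estimates vary for two batch sizes. Everything in it is an assumption for illustration: the toy data (y = 3x plus noise), the 1-D least-squares loss, and the function name. It only shows that gradient noise shrinks as the batch grows; it does not measure the regularizing effect.

```python
import random
import statistics


def minibatch_gradient_spread(batch_size, n_trials=2000, seed=0):
    """Std. dev. of mini-batch gradient estimates for a 1-D least-squares loss."""
    rng = random.Random(seed)
    # Toy dataset (hypothetical): y = 3x + Gaussian noise.
    xs = [rng.uniform(-1, 1) for _ in range(1000)]
    ys = [3.0 * x + rng.gauss(0, 0.5) for x in xs]
    w = 0.0  # evaluate every gradient at the same fixed weight

    grads = []
    for _ in range(n_trials):
        # Sample a mini-batch with replacement.
        idx = [rng.randrange(len(xs)) for _ in range(batch_size)]
        # Gradient of mean((w * x - y) ** 2) with respect to w over the batch.
        g = sum(2 * (w * xs[i] - ys[i]) * xs[i] for i in idx) / batch_size
        grads.append(g)
    return statistics.stdev(grads)


small = minibatch_gradient_spread(batch_size=8)
large = minibatch_gradient_spread(batch_size=128)
print(f"spread with batch size 8:   {small:.3f}")
print(f"spread with batch size 128: {large:.3f}")
```

The spread for the larger batch comes out smaller, so each update follows the full-data gradient more closely.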

sbovdey · September 24, 2021, 1:29pm

Hi @manifest
I thought that small batches, because of the noise, prevent the model from converging close to the minimum (if we have not decayed alpha), which is similar to early stopping. If so, you can decrease this regularizing effect not only by increasing the batch size but also by decaying alpha. Am I right?

manifest · September 24, 2021, 3:03pm

Hey @sbovdey,

That’s correct. With mini-batches, not every step moves toward the minimum, and with a fixed learning rate the parameters keep oscillating around it. That’s the reason why we usually decrease the learning rate during training.
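As a small sketch of what decreasing the learning rate can look like, here is one common inverse-time schedule, alpha = alpha0 / (1 + decay_rate * epoch). The thread does not say which schedule is meant, so the formula, the function name and the values of `alpha0` and `decay_rate` are assumptions.

```python
def decayed_learning_rate(alpha0, decay_rate, epoch):
    """Inverse-time decay: the step size shrinks as the epoch number grows."""
    return alpha0 / (1.0 + decay_rate * epoch)


# Hypothetical settings: start at 0.2 and decay with rate 1.0.
for epoch in range(5):
    alpha = decayed_learning_rate(alpha0=0.2, decay_rate=1.0, epoch=epoch)
    print(f"epoch {epoch}: alpha = {alpha:.4f}")
```

Smaller steps late in training mean the noisy mini-batch updates move the parameters less, so they settle closer to the minimum.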
