Decreasing Regularization

heeseong_kim · October 30, 2021, 12:42pm

In C3W1 Bird Recognition, problem number 10, the correct answers to the question are (1) Train a bigger model and (2) Try decreasing regularization.
I understand answer (1), but I do not understand how decreasing regularization can help in this situation.

jonaslalin · November 1, 2021, 7:25pm

Because we are currently underfitting the training data: too much regularization has resulted in a model that is too simple, so decreasing it lets the model fit the training data more closely. In bias-variance terms:

A model that exhibits small variance and high bias will underfit the target, while a model with high variance and little bias will overfit the target.
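To make the underfitting point concrete, here is a minimal sketch using closed-form ridge regression in NumPy rather than a neural network. The synthetic data, the helper names `fit_ridge` and `mse`, and the penalty values are all assumptions chosen for illustration; they are not taken from the course. The idea is only that the training error drops as the L2 penalty `lam` is decreased.

```python
import numpy as np

def fit_ridge(X, y, lam):
    """Closed-form ridge regression: w = (X^T X + lam * I)^-1 X^T y."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n_features), X.T @ y)

def mse(X, y, w):
    """Mean squared error of the linear model w on (X, y)."""
    return float(np.mean((X @ w - y) ** 2))

# Synthetic training set (illustrative assumption): linear target plus small noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
true_w = np.array([3.0, -2.0, 1.5, 0.5, -1.0])
y = X @ true_w + 0.1 * rng.normal(size=200)

# Strong penalty -> weights shrunk toward zero -> model too simple (high bias).
# Weaker penalty -> model is free to fit the training data.
for lam in (1000.0, 10.0, 0.1):
    w = fit_ridge(X, y, lam)
    print(f"lam={lam:>7}: training MSE = {mse(X, y, w):.4f}")
```

With a very large `lam` the weights are pulled so far toward zero that the model cannot even fit the training set, which is the high-bias situation described above. Lowering `lam` removes that constraint.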

heeseong_kim · November 2, 2021, 12:27am

Thank you for your answer.
I have one more question.

In the course, I learned as follows:
If we are “underfitting” the training data → larger network, train longer
If we are “overfitting” the training data → regularization, more data

So, if we are “overfitting” the training data, can we also use a “smaller” network and train “shorter”? Would that be as appropriate as applying more regularization and getting more data?

jonaslalin · November 2, 2021, 10:12am

Using a smaller network and regularizing the network are similar in the sense that both limit the model's capacity: regularization results in fewer weights doing the heavy lifting, while in a smaller network all of the (fewer) weights participate in the heavy lifting.

During hyperparameter tuning, the first thing to make sure of is that you train for the right number of epochs, i.e., that you stop before you start overfitting from training too long. Hence, you always use early stopping in one way or another.
Next, check how your training performance compares to your performance on the validation set. If you overfit, add more data if possible; otherwise start regularizing by penalizing weights, adding dropout, decreasing network size, or any other method you can think of.
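The two steps above can be sketched roughly as follows. Both helpers, `best_epoch` and `diagnose`, are hypothetical names written for this illustration, and the `patience` and `tol` defaults are arbitrary assumptions rather than recommended values. The first picks the epoch that patience-based early stopping would keep from a list of validation losses; the second compares training, validation, and target error and suggests a direction.

```python
def best_epoch(val_losses, patience=3):
    """Return the index of the epoch that early stopping would keep.

    Training stops once validation loss has failed to improve for
    `patience` consecutive epochs; the best epoch seen so far is kept.
    """
    best_idx, best_loss, waited = 0, float("inf"), 0
    for idx, loss in enumerate(val_losses):
        if loss < best_loss:
            best_idx, best_loss, waited = idx, loss, 0
        else:
            waited += 1
            if waited >= patience:
                break
    return best_idx

def diagnose(train_err, val_err, target_err, tol=0.02):
    """Suggest next steps from error rates (tol is an arbitrary threshold)."""
    advice = []
    if train_err - target_err > tol:
        # Training error is far from the target: the model is too simple.
        advice.append("underfitting: bigger model, train longer, less regularization")
    if val_err - train_err > tol:
        # Validation error is far above training error: poor generalization.
        advice.append("overfitting: more data, more regularization, smaller model")
    return advice or ["no large gap: errors are close to the target"]

# Example usage with made-up numbers.
print(best_epoch([1.0, 0.8, 0.7, 0.75, 0.72, 0.71, 0.9]))
print(diagnose(train_err=0.15, val_err=0.16, target_err=0.01))
```

In the usage example the training error is far above the target while the validation gap is small, so the sketch points toward the underfitting remedies, including less regularization, which matches the quiz answer discussed at the top of the thread.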

Goodfellow writes:

in practical deep learning scenarios, we almost always do find—that the best fitting model (in the sense of minimizing generalization error) is a large model that has been regularized appropriately…
