In the video about Orthogonalization, Andrew said he did not use early stopping because it’s less orthogonalized.
But without early stopping, how can I find the best parameters?
When tuning hyperparameters, why shouldn't I use early stopping?
For example, with the learning rate, I can only find the best result when I use early stopping, because the best number of epochs depends on the learning rate; a fixed epoch count is useless. Or is there another strategy?
In other words: without early stopping, how do I find the best number of epochs while tuning other hyperparameters?
I think early stopping is just used to prevent the model from overfitting. You run the model, see at what epoch the overfitting starts, and then choose that epoch.
Even without early stopping, you can still find the best epoch by watching for when the validation error starts to increase. At least, that's my understanding.
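To make that concrete, here is a minimal sketch of picking the "best epoch" after the fact from a recorded validation-loss history, rather than halting training early. The loss values are made-up illustration data, not from any real run.

```python
# Hypothetical validation-loss history, one entry per epoch (illustrative numbers).
val_losses = [0.90, 0.55, 0.40, 0.33, 0.31, 0.34, 0.38, 0.45]

# The "best epoch" is where validation loss bottoms out,
# i.e. just before it starts to increase again.
best_epoch = min(range(len(val_losses)), key=lambda e: val_losses[e])

print(best_epoch)             # index of the epoch with the lowest validation loss
print(val_losses[best_epoch])
```

The point is that you can log the full history, finish training, and only then decide which checkpoint to keep, so the epoch count never has to be fixed in advance.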
Basically, the reason you don't want to early stop is that early stopping reduces training accuracy, and we normally want to approach our estimated Bayes error, which for image recognition tasks is around 0% error. So suppose you are at 96% training accuracy and you see validation error increasing, so you decide to early stop. You might do that and get a test accuracy of, say, around 94%, but it will likely not be higher than your training accuracy. And in this task, 96% is not good enough, because we know we can do better.
Hence Andrew suggests treating bias and variance as separate problems: first reduce training error, then look at bias and variance. Say you keep training until you achieve 99% training accuracy, then run the model on the test set and see 94% accuracy. This is now purely a high-variance problem, since you have already attained low bias, so you concentrate on reducing variance by trying regularization, getting more data, etc. After all of this, you might end up with 99% training accuracy and 98% test accuracy.
That is a lot better than early stopping, because you handled the two problems separately, which told you a lot about what to do next.
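The decision logic described above can be sketched as a tiny helper. The function name, thresholds, and accuracy numbers here are my own illustrative assumptions, not anything from the course:

```python
def diagnose(train_acc, test_acc, bayes_acc=1.0, tol=0.02):
    """Return which problem to attack next: 'bias', 'variance', or 'done'.

    bayes_acc is the assumed achievable (Bayes-level) accuracy;
    tol is an arbitrary tolerance for "close enough".
    """
    bias = bayes_acc - train_acc     # avoidable bias: gap from training to Bayes
    variance = train_acc - test_acc  # variance: gap from training to test
    if bias > tol:
        return "bias"      # keep training / try a bigger model first
    if variance > tol:
        return "variance"  # then regularize / get more data
    return "done"

print(diagnose(0.96, 0.94))  # still a bias problem: keep reducing training error
print(diagnose(0.99, 0.94))  # low bias, high variance: regularize / add data
print(diagnose(0.99, 0.98))  # both gaps small
```

Early stopping couples the two knobs: it trades training accuracy for a smaller train/test gap in one move, which is exactly why it is "less orthogonalized."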
I understand now that first focusing on reducing bias and then focusing on reducing variance is better.
I hadn't thought that much about reducing bias; I was too focused on preventing overfitting.