About dropout and convergence of loss function

bob08151990 · September 15, 2021, 8:18pm

I have a question about dropout. The lecture mentions that dropout can be shown to be similar or equivalent to an adaptive form of L2 regularization. However, because a different random set of units is dropped on every pass, the loss function is no longer a fixed quantity from one epoch to the next. How can we ensure that the error will eventually converge? It’s not clear to me.

paulinpaloalto · September 16, 2021, 3:11am

There is never any guarantee that gradient descent will converge with any particular choice of hyperparameters, which includes your choice of regularization method and the values associated with it (\lambda for L2, or the keep probabilities for dropout). So you try. And when/if it fails, you adjust hyperparameters and try again. The general method is described by Prof Ng in Week 3 of Course 2.
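As a toy illustration of the "no guarantee" point, here is a minimal sketch that runs plain gradient descent on the one-dimensional cost J(w) = w**2 with a few learning rates. The function name `final_cost` and the specific values are made up for this example and are not from the course; the only point is that the same algorithm converges for some hyperparameter choices and diverges for others.

```python
def final_cost(learning_rate, steps=50, w0=1.0):
    """Run gradient descent on J(w) = w**2 and return the final cost."""
    w = w0
    for _ in range(steps):
        grad = 2.0 * w               # dJ/dw for J(w) = w**2
        w -= learning_rate * grad    # standard gradient descent update
    return w * w


# Hypothetical learning rates: the first two shrink the cost, the last blows it up.
for lr in (0.1, 0.5, 1.1):
    print(lr, final_cost(lr))
```

Each update multiplies w by (1 - 2 * learning_rate), so on this cost the iteration only shrinks toward zero when that factor has magnitude below 1. Real networks are far less tidy, which is why in practice you try, look at the cost curve, and adjust.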

paulinpaloalto · September 16, 2021, 3:36pm

The other point worth making here is something that Prof Ng discusses in the dropout lectures:

The purpose of regularization is to address overfitting. That means you have already come up with a set of hyperparameter choices for which training converges to a solution; the problem is just that the solution overfits. So if you already have convergence, adding regularization is probably not going to disturb that aspect too much. But the solution surfaces here are pretty complex, so anything is still possible, and you may need to adjust other hyperparameters in the process. No guarantees, but you probably want to start with a relatively “mild” amount of dropout (i.e. a keep probability close to 1) and tune from there.
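To make the "mild" setting concrete, here is a minimal sketch of inverted dropout in plain Python. The helper names `inverted_dropout` and `mean` are invented for this example, and `keep_prob` is assumed to mean the probability of keeping a unit. With `keep_prob` at 1 nothing is dropped; as it decreases, individual passes get noisier, but the rescaling by `keep_prob` keeps the expected activation unchanged.

```python
import random


def inverted_dropout(activations, keep_prob, rng):
    """Zero each activation with probability 1 - keep_prob, rescale the survivors."""
    if not 0.0 < keep_prob <= 1.0:
        raise ValueError("keep_prob must be in (0, 1]")
    out = []
    for a in activations:
        keep = rng.random() < keep_prob   # rng.random() is in [0, 1)
        # Dividing by keep_prob keeps the expected value of each unit the same.
        out.append(a / keep_prob if keep else 0.0)
    return out


def mean(xs):
    return sum(xs) / len(xs)


rng = random.Random(0)            # fixed seed so runs are repeatable
activations = [1.0] * 10000       # toy activations, all equal to 1
for keep_prob in (1.0, 0.9, 0.5):
    dropped = inverted_dropout(activations, keep_prob, rng)
    print(keep_prob, round(mean(dropped), 3))
```

The printed means should all sit near 1, while the individual masks differ on every call. That randomness is why the training cost looks noisy with dropout on; one common sanity check is to set `keep_prob` to 1 first, confirm the cost goes down, and only then turn dropout on.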
