In the Understanding Dropout lecture, Professor Ng gives an example in which the 2nd layer of a neural network has a large weight matrix (7x7, i.e. 7 inputs and 7 outputs), and hence needs a lower keep_prob (0.5) to regularize it more heavily.
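For concreteness, here is a minimal numpy sketch of the inverted-dropout mechanic taught in the course, applied with the keep_prob of 0.5 mentioned above (the 7-unit activation vector and variable names are my own, just for illustration):

```python
import numpy as np

np.random.seed(0)

a1 = np.random.randn(7, 1)   # hypothetical activations feeding the 7x7 layer
keep_prob = 0.5              # lower keep_prob for the large layer, as in the lecture

# Inverted dropout: mask, drop, then rescale so the expected activation is unchanged
d1 = np.random.rand(*a1.shape) < keep_prob
a1 = (a1 * d1) / keep_prob
```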
-
Am I correct in saying that this layer needs more regularization because of its high number of input nodes (7), and not necessarily its number of output nodes (also 7)? I ask because earlier in the lecture, he explained that dropout spreads out the weights across the input nodes (using the example of a node with multiple inputs, each with an associated weight). In other words, dropout seems to primarily affect a layer's input weights.
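To illustrate what I mean (a sketch under my own assumptions, not taken from the lecture): when an input unit is dropped, the entire column of W that multiplies it contributes nothing to z, so the layer cannot lean too heavily on any single incoming weight:

```python
import numpy as np

np.random.seed(1)

W = np.random.randn(7, 7)                # the 7x7 weight matrix from the example
a_prev = np.random.randn(7, 1)           # inputs to this layer
d = np.random.rand(*a_prev.shape) < 0.5  # drop roughly half the input units
z = W @ ((a_prev * d) / 0.5)             # dropped inputs zero out whole columns of W
```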
-
Does this also mean that layer 3 (with its 3x7 weight matrix) needs more regularization, i.e. a lower keep_prob (perhaps also 0.5, rather than the 0.7 shown in the lecture)? After all, this layer also has a high number of input nodes (7).
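For reference, here is a quick per-layer weight count, assuming layer sizes consistent with the 7x7 and 3x7 matrices above (the other sizes are my guess at the lecture's network, not confirmed):

```python
# Hypothetical layer sizes consistent with the 7x7 (W2) and 3x7 (W3) matrices above
layer_dims = [3, 7, 7, 3, 2, 1]
for l in range(1, len(layer_dims)):
    n_out, n_in = layer_dims[l], layer_dims[l - 1]
    print(f"W{l}: {n_out}x{n_in} = {n_out * n_in} weights")
```

Under these assumed sizes, layer 3 shares the same 7 inputs as layer 2 but has fewer weights in total (21 vs. 49), which is why I am unsure whether the input count alone should drive the keep_prob choice.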
Thank you in advance for helping clarify my confusion!