I don’t understand the statement in your question 1. Can you give a reference to where Prof Ng says that? The offset into the video would be most useful.
For 2) and 3), the point you are missing is that dropout zeroes out specific neurons on each iteration. The actual neurons that get "zapped" are chosen randomly and are different for each sample on each iteration. We then compensate for those missing neurons by slightly increasing the magnitude of all the neurons that were not "zapped" in that particular iteration. So the total amount of "activation energy" stays (roughly) the same, but it comes from different neurons each time.

The whole point of dropout is that it weakens the connections between particular output neurons in one layer and the input neurons of the next layer. But we don't want an overall reduction in the amount of "energy" being output, as measured (for example) by the 2-norm of the layer's activation matrix A.

For point 3) in particular, Prof Ng is making an analogy to the concept of "expected value" in statistics. Even though we are zapping some neurons in the layer on each iteration, we want the "expected value" of the activations, viewed at the aggregate level, to stay roughly constant.
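To make the "expected value" point concrete, here is a minimal numpy sketch of inverted dropout, which is the variant used in the course: you zero out neurons with a random mask and then divide the survivors by keep_prob. The function name `inverted_dropout` and the `rng` argument are just my choices for illustration; `keep_prob` plays the same role as in the lectures.

```python
import numpy as np

def inverted_dropout(A, keep_prob, rng=None):
    """Zero out each activation with probability 1 - keep_prob, then scale
    the survivors so the expected value of A stays (roughly) unchanged."""
    rng = np.random.default_rng() if rng is None else rng
    # D[i, j] is 1 if that neuron is kept for that sample, 0 if it is "zapped"
    D = (rng.random(A.shape) < keep_prob).astype(A.dtype)
    # Dividing by keep_prob boosts the surviving neurons so that
    # E[D * A / keep_prob] = A  (the "expected value" argument)
    return (A * D) / keep_prob

# Rough check: the aggregate "energy" is preserved on average
A = np.abs(np.random.default_rng(0).standard_normal((5, 10000)))
A_drop = inverted_dropout(A, keep_prob=0.8)
print(A.mean(), A_drop.mean())  # the two means should be close
```

With keep_prob = 0.8, roughly 20% of the entries are zeroed, but the remaining 80% are scaled up by 1/0.8 = 1.25, so the mean (and, roughly, the 2-norm) of A stays about the same even though the "energy" comes from a different random subset of neurons on every pass.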