Course 2 -- Week 1 -- Dropout

Zoooeee · June 28, 2021, 11:10pm

Hello, I have a question about the dropout implementation. When we implement dropout, we rescale A[l] by dividing it by keep_prob so that the expected output stays the same. I do not quite understand this. Dropout is applied in every iteration, and in each iteration the realized fraction of kept units can differ from its expected value. For example, with keep_prob = 0.5 and 3 neurons, it is possible to keep all of them in one iteration, yet we still multiply A[l] by 2, which increases the output value.

Why don’t we rescale A[l] directly by the realized number of dropped units in each iteration? We can easily keep track of that number by looking at the D[l] matrix. Thank you.
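
To make the two options in the question concrete, here is a minimal NumPy sketch. It is not the course's reference code. It applies inverted dropout with the fixed 1/keep_prob factor and, as the alternative, rescales by the fraction of units actually kept in D[l]. The helper names `inverted_dropout` and `realized_rescale`, the toy activations, the seed, and the number of trials are all assumptions made for illustration.

```python
import numpy as np

def inverted_dropout(a, keep_prob, rng):
    """Hypothetical helper: inverted dropout with the fixed 1/keep_prob factor."""
    d = rng.random(a.shape) < keep_prob   # mask D[l]: True means the unit is kept
    return a * d / keep_prob, d

def realized_rescale(a, d):
    """Hypothetical helper: divide by the fraction of units actually kept."""
    kept = d.mean()
    if kept == 0:                         # every unit dropped, nothing to rescale
        return np.zeros_like(a)
    return a * d / kept

a = np.array([1.0, 2.0, 3.0])             # toy A[l] with 3 neurons
keep_prob = 0.5

# The case from the question: all three units happen to be kept.
d_all = np.ones(3, dtype=bool)
fixed_all_kept = a * d_all / keep_prob            # A[l] is doubled in this draw
realized_all_kept = realized_rescale(a, d_all)    # A[l] is unchanged in this draw

# Average both variants over many independent masks.
rng = np.random.default_rng(0)
trials = 20000
avg_fixed = np.zeros_like(a)
avg_realized = np.zeros_like(a)
for _ in range(trials):
    out, d = inverted_dropout(a, keep_prob, rng)
    avg_fixed += out
    avg_realized += realized_rescale(a, d)
avg_fixed /= trials
avg_realized /= trials

print("all kept, fixed factor:     ", fixed_all_kept)
print("all kept, realized fraction:", realized_all_kept)
print("average, fixed factor:      ", avg_fixed)
print("average, realized fraction: ", avg_realized)
```

With the fixed factor, a single draw can overshoot (all three units kept gives 2 * A[l]), but each unit's average over many draws approaches its original value, which is the sense in which the expected output stays the same. Rescaling by the realized fraction has to special-case the draw in which every unit is dropped.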

kampamocha · June 28, 2021, 11:25pm

Hi @Zoooeee,

This question has been asked before; the following posts may be useful:

Hope that helps.

Related topics

Topic	Replies	Views	Activity
Doubt about the implementation of inverted dropout Improving Deep Neural Networks: Hyperparameter tun coursera-platform	5	830	November 19, 2024
[C2W1 - Regularization] A question about inverted dropout scaling factor Improving Deep Neural Networks: Hyperparameter tun coursera-platform	3	1085	January 27, 2024
Regularization by Inverted Dropout Improving Deep Neural Networks: Hyperparameter tun coursera-platform	1	687	August 12, 2021
Inverted Dropout - Query Improving Deep Neural Networks: Hyperparameter tun coursera-platform	1	651	June 4, 2022
Week 1 - Possible Mistake on Lecture Video? Improving Deep Neural Networks: Hyperparameter tun week-module-1, coursera-platform	4	33	March 4, 2025