Implementation of keep_prob in dropout

The lecture says that dropout randomly zeros certain neurons on each iteration, and that we use keep_prob to control this. That is, if keep_prob is 0.8, then 80% of the neurons are retained and 20% are set to zero. But the actual implementation behaved somewhat differently: even with keep_prob at 0.8, 20% of the values in the matrix were not zero. I am adding the code and the corresponding output below:

import numpy as np
np.random.seed(1)
D = np.random.rand(10, 1)
print(D)
D = (D < 0.8).astype(int)
print(D)

Output


[[4.17022005e-01]
 [7.20324493e-01]
 [1.14374817e-04]
 [3.02332573e-01]
 [1.46755891e-01]
 [9.23385948e-02]
 [1.86260211e-01]
 [3.45560727e-01]
 [3.96767474e-01]
 [5.38816734e-01]]
[[1]
 [1]
 [1]
 [1]
 [1]
 [1]
 [1]
 [1]
 [1]
 [1]]

Here, after applying keep_prob, all the values are 1. Shouldn't 2 of the values be 0?
Only then would multiplying it with the A vector set some of the neurons to zero, isn't that so?

Please correct me if something is wrong.

Hi @jijo,

Given the random nature of the process, that can happen. There is no guarantee that exactly 20% of the neurons are dropped; it could be more or fewer, depending on the random draw.
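To put rough numbers on that, here is a minimal sketch that computes how likely each outcome is for a 10-element mask. It assumes every entry is kept independently with probability keep_prob, which is how the `np.random.rand(...) < keep_prob` comparison behaves; the names `n` and `pmf` are only for illustration.

```python
from math import comb

keep_prob = 0.8
n = 10  # number of units in the mask, as in the question

# Binomial probabilities: P(exactly k of the n units are kept),
# assuming each unit is kept independently with probability keep_prob.
pmf = [comb(n, k) * keep_prob**k * (1 - keep_prob)**(n - k) for k in range(n + 1)]

p_all_kept = pmf[n]   # no unit dropped at all, as in the question's output
p_exactly_8 = pmf[8]  # exactly 20% dropped

print(f"P(all {n} kept)   = {p_all_kept:.4f}")
print(f"P(exactly 8 kept) = {p_exactly_8:.4f}")
```

Under that assumption, a mask of all ones shows up in roughly one draw out of ten, and an exact 80/20 split in fewer than a third of the draws, so the output in the question is not unusual.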

There have been some discussions about it in this forum, so I encourage you to take a look at those. Some posts that could be helpful are below, but I’m sure there are more.

Exactly! It’s all statistical. Try your experiment several times in a row without resetting the random seed in between and watch what happens. Here’s such an experiment:

np.random.seed(42)
keep_prob = 0.8
for ii in range(20):
    D = np.random.rand(10,1)
    D = (D < keep_prob).astype(float)
    print(f"{ii}: mean(D) = {np.mean(D)}")

Here’s what I get running that:

0: mean(D) = 0.8
1: mean(D) = 0.8
2: mean(D) = 1.0
3: mean(D) = 0.7
4: mean(D) = 0.9
5: mean(D) = 0.6
6: mean(D) = 0.7
7: mean(D) = 0.9
8: mean(D) = 0.8
9: mean(D) = 1.0
10: mean(D) = 0.9
11: mean(D) = 0.5
12: mean(D) = 0.6
13: mean(D) = 0.8
14: mean(D) = 0.9
15: mean(D) = 0.8
16: mean(D) = 0.9
17: mean(D) = 0.9
18: mean(D) = 0.7
19: mean(D) = 0.6

Now if you want to get really statistical: add a computation of the mean of the means and see how long it takes that to converge to keep_prob.
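One possible way to try that is sketched below. It repeats the same experiment many more times and tracks the running mean of the per-trial means; the trial count `n_trials` and the checkpoints printed at the end are arbitrary choices made for illustration.

```python
import numpy as np

np.random.seed(42)
keep_prob = 0.8
n_trials = 10000  # arbitrary; large enough to see the average settle

# Fraction of kept units in each freshly drawn 10x1 mask
means = np.empty(n_trials)
for ii in range(n_trials):
    D = (np.random.rand(10, 1) < keep_prob).astype(float)
    means[ii] = D.mean()

# Running "mean of the means" after 1, 2, ..., n_trials trials
running = np.cumsum(means) / np.arange(1, n_trials + 1)

for n in (10, 100, 1000, 10000):
    print(f"after {n:>5} trials: mean of means = {running[n - 1]:.4f}")
```

Individual masks still swing anywhere from about 0.5 to 1.0, but the running average should drift toward keep_prob as the number of trials grows.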

Hi @kampamocha,

Thank you for your response. I have a better understanding now.

Hi @paulinpaloalto,

Thank you for your response.