Hello!
First of all, I’d like to thank the course team, you’ve done a really good job!
I think I found an issue in a lecture.
The second slide of the lecture “[C2W1] Dropout Regularization” shows this code:
d3 = np.random.rand(a3.shape[0], a3.shape[1]) < keep_prob
a3 = np.multiply(a3, d3)
With this code, after the element-wise multiplication in the second line, the neurons whose random value is greater than keep_prob are set to zero.
The inverse condition, “d3 = d3 > keep_prob”, looks more logical to me, because that way we keep the elements with the greater random values.
Is the sign in this expression a misprint by the lecturer?
Thanks in advance for your reply.
The random function being used there draws from the “uniform” distribution on the interval [0, 1), right? So suppose we want to keep 80% of the neurons (meaning keep_prob = 0.8). If the output of np.random.rand is uniformly distributed on [0, 1), then saying this:
mask = random_output < keep_prob
will result in approximately 80% of the elements of mask being 1 as opposed to 0, right? So roughly 80% of the values of a3 will be preserved and not zeroed out. Of course, all this behavior is statistical, so the exact number of 1’s on any given run may not be 80% even without quantization errors.
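Here is a minimal sketch you can run to check that (this is not the notebook’s code; the shape and the names other than keep_prob are just for illustration):

import numpy as np

keep_prob = 0.8
a3 = np.random.randn(4, 1000)                               # example activations; the shape is arbitrary

d3 = np.random.rand(a3.shape[0], a3.shape[1]) < keep_prob   # boolean mask, ~80% of entries are True
a3 = np.multiply(a3, d3)                                    # zeroes out the ~20% of units where d3 is False

print(d3.mean())         # prints something close to 0.8
print((a3 == 0).mean())  # and roughly 0.2 of the activations are now zero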
I think the slide is correct as written.
If you wanted to write the formula using >, you could do that, but it requires a bit more care. If keep_prob = 0.8, then 80% of the elements of random_output will be > (1 - keep_prob). Notice, however, that while this gives the same statistical behavior, the precise neurons being “zapped” will be different. You will fail the test cases in the notebook if you use the > scheme I just described.
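Just for illustration (again, not the notebook’s code), comparing the two conditions on the same random draw shows they keep the same fraction of units, but not the same units:

import numpy as np

keep_prob = 0.8
r = np.random.rand(4, 1000)              # one uniform draw used for both masks

mask_lt = r < keep_prob                  # the slide’s version
mask_gt = r > (1 - keep_prob)            # the alternative described above

print(mask_lt.mean(), mask_gt.mean())    # both are roughly 0.8
print((mask_lt == mask_gt).mean())       # only about 0.6 of positions agree, so different neurons get zapped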
Thank you for the detailed explanation!
It makes sense!