Hello!

First of all, I’d like to thank the course team; you’ve done a really good job!

I think I found an issue in one of the lectures.

The second slide of the lecture “[C2W1] Dropout Regularization” shows this code:

`d3 = np.random.rand(a3.shape[0], a3.shape[1]) < keep_prob`

`a3 = np.multiply(a3, d3)`

With this code, after the element-wise multiplication in the second line, the neurons whose random values are greater than keep_prob end up equal to zero.

The inverse comparison, `d3 = d3 > keep_prob`, looks more logical to me, because that way we keep the elements with the greater values.

Is the sign in this expression a misprint by the lecturer?

Thanks in advance for your reply.

The random function being used there is the “uniform” distribution on the interval [0, 1], right? So suppose we want to keep 80% of the neurons (meaning *keep_prob = 0.8*). If the output of *np.random.rand* is uniformly distributed on [0, 1], then saying this:

`mask = random_output < keep_prob`

will result in approximately 80% of the elements of *mask* being 1 as opposed to 0, right? So roughly 80% of the values of a3 will be preserved and not zeroed out. Of course, all this behavior is statistical, so the exact number of 1’s on any given run may not be exactly 80%.
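Here is a minimal sketch of that behavior (variable names follow the slide; the seed and array sizes are my own choices for illustration):

```python
import numpy as np

np.random.seed(0)  # fixed seed for reproducibility (not part of the slide)
keep_prob = 0.8

a3 = np.random.randn(4, 5)                   # stand-in activations
d3 = np.random.rand(*a3.shape) < keep_prob   # boolean mask, ~80% True
a3_dropped = np.multiply(a3, d3)             # zeroes out the ~20% False entries

print(d3.mean())  # fraction of kept neurons, close to (but not exactly) 0.8
```

Note that multiplying by a boolean array works because NumPy treats True as 1 and False as 0 in arithmetic.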

I think the slide is correct as written.

If you wanted to write the formula using >, you can do that, but it requires a bit more care. If *keep_prob = 0.8*, then 80% of the elements of *random_output* will be greater than *(1 - keep_prob)*. Notice that this gives the same statistical behavior, but the precise set of neurons being “zapped” will be different. You will fail the test cases in the notebook if you use the > scheme I just described.
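A quick sketch comparing the two formulations (seed and sizes are mine, chosen just to make the statistics visible):

```python
import numpy as np

np.random.seed(1)
keep_prob = 0.8
rand = np.random.rand(1000, 1000)  # large sample so the fractions are stable

mask_lt = rand < keep_prob         # the slide's version: keeps values in [0, 0.8)
mask_gt = rand > (1 - keep_prob)   # the > variant: keeps values in (0.2, 1]

# Both keep roughly 80% of the elements...
print(mask_lt.mean(), mask_gt.mean())
# ...but they disagree on values in [0, 0.2] and [0.8, 1],
# so the specific neurons kept are different:
print(np.array_equal(mask_lt, mask_gt))
```

Since the graded notebook compares your mask element by element against a reference built with <, only the slide’s version passes.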


Thank you for the detailed explanation!

It makes sense.