Hi all,
My code runs but doesn’t output the correct answer. Any clue? Thanks in advance for your help!
The question: forward_propagation_with_dropout
Implement forward propagation with dropout. You are using a 3-layer neural network, and will add dropout to the first and second hidden layers. We will not apply dropout to the input layer or output layer.
Instructions: You would like to shut down some neurons in the first and second layers. To do that, you are going to carry out 4 Steps:
- In lecture, we discussed creating a variable d[1] with the same shape as a[1] using np.random.rand() to randomly get numbers between 0 and 1. Here, you will use a vectorized implementation, so create a random matrix D[1] = [d[1](1) d[1](2) ... d[1](m)] of the same dimension as A[1].
- Set each entry of D[1] to be 1 with probability (keep_prob), and 0 otherwise.
Hint: Let’s say that keep_prob = 0.8, which means that we want to keep about 80% of the neurons and drop out about 20% of them. We want to generate a vector that has 1’s and 0’s, where about 80% of them are 1 and about 20% are 0. This Python statement:
X = (X < keep_prob).astype(int)
is conceptually the same as this if-else statement (for the simple case of a one-dimensional array x):

for i, v in enumerate(x):
    if v < keep_prob:
        x[i] = 1
    else:  # v >= keep_prob
        x[i] = 0
Note that X = (X < keep_prob).astype(int) works with multi-dimensional arrays, and the resulting output preserves the dimensions of the input array.
Also note that without using .astype(int), the result is an array of booleans True and False, which Python automatically converts to 1 and 0 if we multiply it with numbers. (However, it’s better practice to convert data into the data type that we intend, so try using .astype(int).)
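To see the hint in action, here is a small illustrative sketch of the mask-generation trick (the seed, keep_prob value, and matrix shape are example values, not from the assignment):

```python
import numpy as np

np.random.seed(1)  # example seed, only for reproducibility here
keep_prob = 0.8

D = np.random.rand(3, 5)          # uniform random numbers in [0, 1)
D = (D < keep_prob).astype(int)   # each entry is 1 with probability keep_prob, else 0

print(D.shape)  # (3, 5) -- the dimensions of the input array are preserved
print(D)        # a matrix of 0s and 1s, with roughly keep_prob of them 1
```

Since np.random.rand draws uniformly from [0, 1), a value is below keep_prob with exactly probability keep_prob, which is why the comparison gives the mask you want.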
- Set A[1] to A[1] * D[1]. (You are shutting down some neurons.) You can think of D[1] as a mask, so that when it is multiplied with another matrix, it shuts down some of the values.
- Divide A[1] by keep_prob. By doing this you are ensuring that the result of the cost will still have the same expected value as without dropout. (This technique is also called inverted dropout.)
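Putting the four steps together for a single hidden layer might look like the sketch below. This is only an illustration of the technique, not the assignment solution: A1, its shape, the seed, and keep_prob are made-up example values, and the assignment's own variable names and function signature may differ.

```python
import numpy as np

np.random.seed(2)  # example seed, for reproducibility of this sketch only
keep_prob = 0.8
A1 = np.random.randn(4, 6)  # stand-in for the layer-1 activations (example shape)

# Step 1: create a random matrix with the same shape as A1
D1 = np.random.rand(A1.shape[0], A1.shape[1])
# Step 2: threshold it so each entry is 1 with probability keep_prob, 0 otherwise
D1 = (D1 < keep_prob).astype(int)
# Step 3: apply the mask -- shut down the dropped neurons
A1 = A1 * D1
# Step 4: scale up so the expected value of the activations is unchanged (inverted dropout)
A1 = A1 / keep_prob

print(A1.shape)  # (4, 6) -- dropout does not change the shape of the activations
```

The same four lines would then be repeated for A2 with a fresh mask D2; a common bug is reusing D1 for the second layer or forgetting the division by keep_prob in one of the layers, which changes the cost.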
My answer
{moderator edit - solution code removed}