Hi all,
My code runs but doesn’t output the correct answer. Any clue? Thanks in advance for your help!
The question: forward_propagation_with_dropout
Implement forward propagation with dropout. You are using a 3-layer neural network, and will add dropout to the first and second hidden layers. We will not apply dropout to the input layer or output layer.
Instructions: You would like to shut down some neurons in the first and second layers. To do that, you are going to carry out four steps:
- In lecture, we discussed creating a variable $d^{[1]}$ with the same shape as $a^{[1]}$ using np.random.rand() to randomly get numbers between 0 and 1. Here, you will use a vectorized implementation, so create a random matrix $D^{[1]} = [d^{[1](1)} d^{[1](2)} \dots d^{[1](m)}]$ of the same dimension as $A^{[1]}$.
- Set each entry of $D^{[1]}$ to be 1 with probability (keep_prob), and 0 otherwise.
Hint: Let’s say that keep_prob = 0.8, which means that we want to keep about 80% of the neurons and drop out about 20% of them. We want to generate a vector that has 1’s and 0’s, where about 80% of them are 1 and about 20% are 0. This Python statement:
X = (X < keep_prob).astype(int)
is conceptually the same as this if-else statement (for the simple case of a one-dimensional array):
```python
for i, v in enumerate(x):
    if v < keep_prob:
        x[i] = 1
    else:  # v >= keep_prob
        x[i] = 0
```
Note that X = (X < keep_prob).astype(int) works with multi-dimensional arrays, and the resulting output preserves the dimensions of the input array.
Also note that without .astype(int), the result is an array of the booleans True and False, which NumPy automatically treats as 1 and 0 if we multiply it with numbers. (However, it’s better practice to convert data into the data type that we intend, so try using .astype(int).)
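To make the hint concrete, here is a small runnable NumPy demo of that statement. The (5, 4) shape is just a made-up stand-in for the shape of $A^{[1]}$, not part of the assignment:

```python
import numpy as np

np.random.seed(1)                  # seed only so the demo is reproducible
keep_prob = 0.8                    # keep about 80% of the neurons

D1 = np.random.rand(5, 4)          # uniform random numbers in [0, 1)
D1 = (D1 < keep_prob).astype(int)  # 1 with probability keep_prob, else 0

print(D1.shape)   # (5, 4): the dimensions of the input are preserved
print(D1.mean())  # roughly 0.8: about 80% of the entries are 1
```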
- Set $A^{[1]}$ to $A^{[1]} * D^{[1]}$. (You are shutting down some neurons.) You can think of $D^{[1]}$ as a mask, so that when it is multiplied with another matrix, it shuts down some of the values.
- Divide $A^{[1]}$ by keep_prob. By doing this you are ensuring that the result of the cost will still have the same expected value as without dropout. (This technique is also called inverted dropout; a sketch of all four steps follows this list.)
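Putting the four steps together, here is a minimal sketch for a single hidden layer, assuming A1 is the post-activation matrix from layer 1. The helper name apply_dropout is made up for illustration; this is only a sketch of the general technique, not the assignment's removed solution code:

```python
import numpy as np

def apply_dropout(A, keep_prob):
    """Inverted dropout on an activation matrix A (illustrative sketch)."""
    D = np.random.rand(A.shape[0], A.shape[1])  # Step 1: random matrix shaped like A
    D = (D < keep_prob).astype(int)             # Step 2: 1 with prob keep_prob, else 0
    A = A * D                                   # Step 3: shut down the masked neurons
    A = A / keep_prob                           # Step 4: rescale (inverted dropout)
    return A, D                                 # keep D: it is reused in backprop

np.random.seed(1)
A1 = np.random.randn(3, 2)                      # pretend activations for layer 1
A1, D1 = apply_dropout(A1, keep_prob=0.8)
```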
My answer
{moderator edit - solution code removed}