Dropout is supposed to drop nodes in the hidden layer randomly on each iteration. Aren't we supposed to drop the same node for all m inputs in the training set?
The programming exercise required me to initialize the dropout matrix for the first hidden layer as follows:
D1 = np.random.rand(A1.shape[0], A1.shape[1])   # one random value per node per example
D1 = (D1 < keep_prob).astype(int)               # 1 = keep the node, 0 = drop it
This causes different nodes to be dropped for each of the m inputs. If we are supposed to drop the same nodes for all m examples, we should instead initialize D1 as a vector of shape (A1.shape[0], 1) and use broadcasting to multiply A1 by D1.
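For concreteness, here is a minimal sketch of the broadcasting alternative I mean (the inverted-dropout rescaling by keep_prob is my assumption of how the rest of the step would look):
D1 = np.random.rand(A1.shape[0], 1)   # one random value per hidden node
D1 = (D1 < keep_prob).astype(int)     # same keep/drop decision for all m examples
A1 = (A1 * D1) / keep_prob            # (n1, 1) mask broadcasts over the (n1, m) activations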
I tried to run my code by initializing D1 as a vector and using broadcasting, but the test cases were failing.
Please clarify.
Thanks,
Manish
Hi @manish_tiwari, the idea in dropout is to turn off nodes randomly for each sample.
If you turn off the same nodes for all m inputs, you will be optimizing a different, smaller NN, so your NN's performance will decrease significantly.
Even if you change the dropped nodes once per epoch, the end result will not be as good as selecting random nodes for each example.
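To make the distinction concrete, here is a small standalone illustration (the shapes, keep_prob and seed are just for the example, not from the assignment):
import numpy as np

rng = np.random.default_rng(0)
keep_prob, n_nodes, m = 0.8, 4, 3

# per-sample dropout: each example (column) keeps its own random subset of nodes
D_per_sample = (rng.random((n_nodes, m)) < keep_prob).astype(int)

# shared dropout: one decision per node, reused for every example in the iteration
D_shared = (rng.random((n_nodes, 1)) < keep_prob).astype(int)

print(D_per_sample)                            # columns generally differ
print(D_shared * np.ones((1, m), dtype=int))   # every column is identical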
I have the same question as @manish_tiwari, since the "Regularization" assignment description mentions "When you shut some neurons down, you actually modify your model."
Regarding the performance, I tried to implement using:
D1 = np.random.rand(A1.shape[0],1)
D2 = np.random.rand(A2.shape[0],1)
Using keep_prob = 0.9 and learning_rate = 0.3, the result is actually not worse (a fuller sketch of how I applied these masks follows the numbers below):
On the train set:
Accuracy: 0.933649289099526
On the test set:
Accuracy: 0.955
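For completeness, this is roughly how I applied those column-vector masks in the forward pass (my own reconstruction of the modified code, not the official assignment solution):
D1 = (np.random.rand(A1.shape[0], 1) < keep_prob).astype(int)
A1 = (A1 * D1) / keep_prob   # same nodes dropped for every example in this pass
D2 = (np.random.rand(A2.shape[0], 1) < keep_prob).astype(int)
A2 = (A2 * D2) / keep_prob
# the backward pass reuses the same masks, e.g. dA1 = (dA1 * D1) / keep_prob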
Yes, this issue has been noticed and discussed a number of times. Here's a thread with more investigations that reach the same conclusion you show: it probably doesn't make all that much difference whether you apply dropout differently per sample or consistently across all samples in a given minibatch.