Hi, suppose keep_prob = 0.5 for a hidden layer with 5 hidden units. The probability that all 5 units are dropped in a single iteration is (1 - keep_prob)^5 = (0.5)^5 = 0.03125, so over 100 iterations this should happen about 3 times on average. Is the whole layer shut down in those iterations? How does the algorithm deal with it?
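
To make the arithmetic concrete, here is a minimal sketch. It assumes each unit is kept independently with probability keep_prob, which is my understanding of how the mask is sampled. The function names are hypothetical helpers written for this post, not part of any library.

```python
import random


def prob_all_dropped(keep_prob, n_units):
    """Exact probability that every unit in the layer is dropped in one pass.

    Assumes each unit is dropped independently with probability 1 - keep_prob.
    """
    return (1.0 - keep_prob) ** n_units


def simulate_all_dropped(keep_prob, n_units, n_iters, seed=0):
    """Count the iterations in which the sampled dropout mask is all zeros."""
    rng = random.Random(seed)
    count = 0
    for _ in range(n_iters):
        # True means the unit is kept, False means it is dropped.
        mask = [rng.random() < keep_prob for _ in range(n_units)]
        if not any(mask):
            count += 1
    return count


p = prob_all_dropped(0.5, 5)
expected_per_100 = 100 * p
print(p)                 # 0.03125
print(expected_per_100)  # 3.125

# Empirical check over many iterations (hypothetical settings).
n_iters = 100_000
hits = simulate_all_dropped(0.5, 5, n_iters)
print(hits / n_iters)    # should be close to 0.03125
```

The simulated fraction should land near the exact value, which is where the "about 3 times in 100 iterations" figure comes from.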
Thanks!
Repeated. Question about the dropout process