Hi @plutonic18
Dropout randomly sets neuron activations to zero. Through this regularization we induce additional "noise" into the training process, which can prevent overfitting of the neural network.
Let’s take an illustrative example and set the dropout rate quite high, say 50 %. Then roughly half of your activations would be missing, since they would be "dropped out", i.e. set to an output of 0.
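To make this concrete, here is a minimal NumPy sketch of such a dropout mask (the array shape and variable names are just assumptions for illustration, not the assignment code):

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.normal(size=1000)              # some layer's activations
keep_prob = 0.5                        # dropout rate of 50 %
mask = rng.random(a.shape) < keep_prob # True = keep, False = drop
a_dropped = a * mask                   # roughly half the activations become 0
print((a_dropped == 0).mean())         # ~0.5
```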
Apart from the fact that training itself would probably not succeed well, the biggest reason from my perspective is the following:
If we did not compensate for the dropped-out activations, model performance would be very problematic because the remaining activations would simply not be representative: on average, the layer output would be scaled down by the dropout rate. If you are in doubt why, feel free to take a look at how a typical histogram (the statistical distribution) of the output activations looks, as described here.
Accordingly, model performance would suffer a lot from this distorted distribution of activations if we did not compensate for the dropout rate, i.e. by dividing by keep_prob. This should also become apparent as poor training performance.
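Here is a small sketch of why dividing by keep_prob (often called inverted dropout) restores the expected scale of the activations; the numbers and names are again just illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)
a = np.abs(rng.normal(size=100_000))   # hypothetical positive activations (e.g. after ReLU)
keep_prob = 0.5
mask = rng.random(a.shape) < keep_prob

dropped  = a * mask                    # no compensation: expected magnitude shrinks
inverted = a * mask / keep_prob        # inverted dropout: divide by keep_prob

print(a.mean())         # reference scale of the activations
print(dropped.mean())   # ~ half the reference -> distribution is scaled down
print(inverted.mean())  # ~ back to the reference -> expected value preserved
```

With this compensation, the expected activation stays the same whether dropout is active (training) or switched off (inference).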
Here you can find some further useful explanation:
- Dropout in Neural Networks. Dropout layers have been the go-to… | by Harsh Yadav | Towards Data Science
- Do you see regularization the way I do? - #2 by Christian_Simonis
Hope that helps!
Best regards
Christian