If I understood correctly, dropout layers randomly drop out neurons so that the network doesn’t rely on just a few features; it’s like training your right arm in the gym but leaving your left arm untrained.
So my question is: has anyone experimented with dropping out neurons not at random, but by specifically choosing the neurons with high activations, since those are the “right arm”?
The point is that the dropout is random per neuron and per training iteration. If you want a physical analogy, rather than thinking of a person with two arms, think of an octopus: on each training iteration you use only 6 or 7 of the arms, but which arms are used changes from iteration to iteration. So all the “arms” get strengthened, just by slightly less than they otherwise would be, or perhaps in a more balanced way than they would be without dropout.
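To make the "different arms each time" point concrete, here is a minimal sketch of standard (inverted) dropout in NumPy; the function name and the drop rate are just for illustration, not any particular framework's API:

```python
import numpy as np

def dropout_forward(activations, p_drop=0.3, rng=np.random.default_rng()):
    """Inverted dropout: a fresh random mask is sampled on every call,
    so a different subset of neurons is silenced each training iteration."""
    keep_prob = 1.0 - p_drop
    mask = rng.random(activations.shape) < keep_prob  # a new "choice of arms" every call
    # Scale the survivors by 1/keep_prob so the expected activation is unchanged
    return activations * mask / keep_prob

# Two forward passes on the same input drop different neurons
x = np.ones(8)
print(dropout_forward(x))
print(dropout_forward(x))
```

Running it twice on the same input shows different neurons zeroed each time, which is exactly why every neuron still gets trained on average.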
The rationale behind standard dropout is to prevent the model from over-relying on a few specific features, so deliberately dropping the most active neurons, as you propose, seems like it could pursue the same goal more directly.
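For comparison, here is a hypothetical sketch of what that activation-targeted variant might look like; this is not a standard layer, and the name `drop_top_k` and the choice of k are made up purely for illustration:

```python
import numpy as np

def drop_top_k(activations, k=2):
    """Hypothetical variant: zero out the k neurons with the HIGHEST
    activations instead of a random subset (the "right arm" neurons)."""
    out = activations.copy()
    top_idx = np.argsort(activations)[-k:]  # indices of the k largest activations
    out[top_idx] = 0.0
    return out

x = np.array([0.1, 2.3, 0.5, 1.7, 0.2])
print(drop_top_k(x, k=2))  # the 2.3 and 1.7 entries are zeroed
```

Note that unlike random dropout this rule is deterministic for a given input, so you would likely still want some randomness (or a rescaling step like the 1/keep_prob factor above) to keep training stable.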