Why can’t we use L2 regularization and dropout simultaneously while training a neural network?
You can! Adding a penalty term to the loss function and dropping units are not mutually exclusive.
Combining the two is even mentioned in the Dropout paper.
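To make the point concrete, here is a minimal NumPy sketch of a single training step that applies both techniques at once: dropout masks the activations, while the L2 penalty is simply added to the loss. All names and values here are illustrative, not from the thread.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy single-layer setup (hypothetical sizes, for illustration only).
W = rng.normal(size=(4, 3))          # weight matrix
x = rng.normal(size=(8, 4))          # mini-batch of 8 inputs
keep_prob = 0.8                      # dropout keep probability
lam = 1e-3                           # L2 penalty strength

# Dropout: randomly zero units during training, rescaling by 1/keep_prob
# ("inverted dropout") so the expected activation is unchanged.
mask = (rng.random(x.shape) < keep_prob) / keep_prob
h = (x * mask) @ W                   # forward pass with dropped inputs

# L2 regularization: add a weight-norm penalty to the data loss.
data_loss = np.mean(h ** 2)          # stand-in for a real loss function
l2_penalty = lam * np.sum(W ** 2)
total_loss = data_loss + l2_penalty  # both regularizers act on one objective
```

The two operate on different parts of the computation (dropout on activations, L2 on weights), which is why nothing stops you from using them together.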
Okay, thank you. Is there a definitive answer for which regularization technique works best in general for very deep neural networks?
Not that I know of; I think you’ll have to experiment to find out what works best for your problem.