Why are non-Gaussian features not ideal for anomaly detection?

Just watched the video on transforming non-Gaussian features to be Gaussian when building anomaly detection systems. What is the logic and intuition behind this?

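Because the detection algorithm models each feature with a Gaussian distribution and flags anything that falls in its low-probability tails as an anomaly. If a feature is not actually Gaussian, the fitted Gaussian describes the data poorly, so the probabilities it assigns are misleading: perfectly normal points can get flagged, and real anomalies can be missed. Transforming the feature so it looks more Gaussian makes the model's notion of "unlikely" match the data.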
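
To make that concrete, here is a rough sketch (not from the video) using synthetic data. It assumes a right-skewed feature drawn from a log-normal distribution and a simple "more than 3 standard deviations from the mean" rule as the anomaly test; the helper names `tail_fraction` and `skewness` are made up for this illustration. It compares how many ordinary points get flagged before and after a log transform.

```python
import numpy as np

def tail_fraction(x, k=3.0):
    """Fit a Gaussian (mean, std) to x and return the fraction of points
    lying more than k standard deviations from the mean."""
    mu, sigma = x.mean(), x.std()
    return np.mean(np.abs(x - mu) > k * sigma)

def skewness(x):
    """Sample skewness: roughly 0 for symmetric (Gaussian-like) data."""
    z = (x - x.mean()) / x.std()
    return np.mean(z ** 3)

# Assumption: a synthetic right-skewed feature with no real anomalies in it.
rng = np.random.default_rng(0)
raw = rng.lognormal(mean=0.0, sigma=1.0, size=10_000)

# The log of a log-normal feature is Gaussian, so this is the "fixed" feature.
transformed = np.log(raw)

raw_skew, log_skew = skewness(raw), skewness(transformed)
raw_flagged, log_flagged = tail_fraction(raw), tail_fraction(transformed)

print(f"skewness: raw={raw_skew:.2f}, log={log_skew:.2f}")
print(f"flagged:  raw={raw_flagged:.4f}, log={log_flagged:.4f}")
```

For a true Gaussian, about 0.3% of points fall outside 3 standard deviations. The transformed feature should land close to that, while the raw skewed feature flags noticeably more points, all of them in the long right tail, even though nothing in the data is actually anomalous. A log transform is only one option; which transform works depends on the shape of the feature.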