The lectures on anomaly detection discuss transforming input data so that it closely resembles a Gaussian probability distribution. Could we also consider fitting other probability distributions if we cannot transform the data to look Gaussian? In practice, are other distributions used? One example I am thinking of is reliability analysis for time to failure, with distributions such as Weibull or Exponential.
My first thought on this is that the Gaussian distribution models naturally occurring phenomena. If you look at nature, things tend to be in balance, and this distribution is spread around a balanced (middle) point. This makes it a good candidate for natural phenomena. Other distributions could be used as well if they describe a phenomenon better, but I doubt that any phenomenon, at least any natural one, is not balanced in some way. Have a look at this link too: Normal distribution
I agree with you, because that is the point of anomaly detection: to detect the cases that fall outside the naturally occurring phenomena.
The normal distribution has been studied thoroughly and has established metrics such as the z-score. So it becomes easy to assess that a sample that is 3 \sigma away from the mean is a rare occurrence.
If we have similar established standards for other distributions, then by all means we could use those as well for anomaly detection.
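To make the idea of a "similar standard" concrete, here is a minimal sketch that uses a tail probability in place of the z-score, applied to the Weibull example from the question. The data are simulated, and the helper `is_anomaly` and the threshold `epsilon` are hypothetical names chosen for illustration; `epsilon = 0.0027` is simply the two-sided tail mass beyond 3 \sigma for a Gaussian, not a recommended value.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Hypothetical time-to-failure data (hours), simulated from a Weibull for illustration.
train = stats.weibull_min.rvs(c=1.5, scale=100.0, size=2000, random_state=rng)

# Fit shape and scale by maximum likelihood; location is fixed at 0
# because failure times cannot be negative.
shape, loc, scale = stats.weibull_min.fit(train, floc=0)
fitted = stats.weibull_min(shape, loc=loc, scale=scale)

def is_anomaly(x, dist, epsilon=0.0027):
    """Flag x when its two-sided tail probability under dist is below epsilon."""
    tail = 2 * np.minimum(dist.cdf(x), dist.sf(x))
    return tail < epsilon

# A typical lifetime and a very long one.
flags = is_anomaly(np.array([90.0, 600.0]), fitted)
print(flags)
```

The same helper works with any frozen `scipy.stats` distribution, so swapping Weibull for Exponential or Gaussian only changes the fitting step.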
This is a really good question!
When it comes to anomalies or black swan events, many phenomena in fact do not follow a Gaussian distribution but have heavier tails (e.g. expected returns in the stock market or medical health indicator data).
So using fat-tailed distributions like the Student's t-distribution can absolutely work for anomaly detection, and it often makes a lot of sense, since the long-tail events in particular come with significant costs if they are not detected (the cost of false negatives). Here is a paper using a generalised Student's t approach for anomaly detection, which could be interesting to take a look at: https://people.cs.vt.edu/~clu/Publication/2013/AAAI-Lu-2013.pdf
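As a rough illustration of how much the choice of tail matters, the sketch below fits both a Gaussian and a Student's t-distribution to the same simulated heavy-tailed sample and compares how rare each model considers an extreme point. This is not the method from the paper above; the data, the seed and the choice of a point six standard deviations out are all assumptions made for the example.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Hypothetical heavy-tailed sample, simulated from a t-distribution
# with 3 degrees of freedom.
data = stats.t.rvs(df=3, loc=0.0, scale=1.0, size=5000, random_state=rng)

# Fit both candidate distributions by maximum likelihood.
mu, sigma = stats.norm.fit(data)
df, t_loc, t_scale = stats.t.fit(data)

# A point six fitted standard deviations above the fitted mean.
x = mu + 6 * sigma

# Upper-tail probability of a value at least this extreme under each fit.
p_norm = stats.norm.sf(x, loc=mu, scale=sigma)
p_t = stats.t.sf(x, df, loc=t_loc, scale=t_scale)

print(f"Gaussian tail: {p_norm:.2e}, Student's t tail: {p_t:.2e}")
```

Under the Gaussian fit such a point looks practically impossible, while the t fit assigns it a much larger probability, so the same threshold would lead to quite different decisions depending on which distribution is assumed.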
Another note on anomaly detection models: vanilla variational autoencoders, for example, generally rely on a Gaussian prior in the latent space, but the literature also discusses extensions to fat-tailed distributions such as the Student's t-distribution: [2004.02581] Variational auto-encoders with Student's t-prior
Hope that helps, @Terry_Green!
Best regards
Christian