C3_W1 Why use the Gaussian distribution

rmwkwok · September 9, 2022, 1:00am

Hello @Andromeda18, when we build an anomaly detection system, it's our job to verify that the samples' distribution matches our model assumptions. For example, in the video, we assumed that the samples are Gaussian distributed on each feature, and we assumed independence among the features.
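To make those two assumptions concrete, here is a minimal sketch (not the course's lab code) of what they imply: each feature gets its own mean and variance, and the density of a sample is the product of the per-feature Gaussian densities. The function names and the toy data are made up for illustration.

```python
import math
from statistics import fmean, pvariance

def fit_gaussian(samples):
    """Estimate a mean and a variance for each feature (column) of the samples."""
    columns = list(zip(*samples))
    means = [fmean(col) for col in columns]
    variances = [pvariance(col, mu) for col, mu in zip(columns, means)]
    return means, variances

def density(x, means, variances):
    """p(x) as a product of 1-D Gaussian densities.

    Multiplying the per-feature densities is only valid if the
    features are independent, which is the assumption being made.
    """
    p = 1.0
    for xj, mu, var in zip(x, means, variances):
        p *= math.exp(-((xj - mu) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)
    return p

# Toy data, invented for this example: 3 samples, 2 features.
train = [[1.0, 10.0], [2.0, 12.0], [3.0, 14.0]]
means, variances = fit_gaussian(train)

p_typical = density([2.0, 12.0], means, variances)  # at the fitted means
p_unusual = density([5.0, 20.0], means, variances)  # far from the training data
```

A sample far from the training data gets a much smaller density than one near the fitted means. If either assumption is badly violated, these densities stop being meaningful, which is why checking the assumptions matters.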

As for whether a feature is Gaussian distributed: quantitatively, we can check it with a test such as the Kolmogorov–Smirnov test; qualitatively, we can examine whether the process that generates the samples on that feature dimension is additive. For example, the ambient noise level in an environment is likely to be Gaussian, because it is the sum of various noise sources: a car passing by, pedestrians talking on the phone or to each other, construction work, and so on. While any one of these sources can be non-Gaussian, their sum tends toward a Gaussian as the number of roughly independent sources grows, according to the central limit theorem.
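Both points can be seen in a small simulation. The sketch below is only an illustration under my own assumptions: it computes the Kolmogorov–Smirnov statistic by hand (the largest gap between the empirical CDF and the CDF of a Gaussian fitted to the same data), and it uses exponential "noise sources", 50 sources per sample, and 2000 samples, all of which are arbitrary choices. Note that because the Gaussian's parameters are estimated from the same data, the standard KS p-value tables do not apply directly, so the statistic is used here only for comparison.

```python
import math
import random
from statistics import fmean, pstdev

def normal_cdf(x, mu, sigma):
    """CDF of a Gaussian with mean mu and standard deviation sigma."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def ks_statistic(values):
    """Largest gap between the empirical CDF and a Gaussian fitted to the values."""
    xs = sorted(values)
    n = len(xs)
    mu, sigma = fmean(xs), pstdev(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = normal_cdf(x, mu, sigma)
        # The empirical CDF jumps from i/n to (i+1)/n at x; check both sides.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d

rng = random.Random(0)  # fixed seed so the run is repeatable
n_samples = 2000

# One non-Gaussian source on its own (exponential, strongly skewed).
single = [rng.expovariate(1.0) for _ in range(n_samples)]

# The sum of 50 such independent sources per sample.
summed = [sum(rng.expovariate(1.0) for _ in range(50)) for _ in range(n_samples)]

d_single = ks_statistic(single)
d_summed = ks_statistic(summed)
print(f"KS statistic, single source:  {d_single:.3f}")
print(f"KS statistic, sum of sources: {d_summed:.3f}")
```

The statistic for the summed feature comes out much smaller than for the single source, meaning the sum is much closer to a Gaussian even though each individual source is not.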

Many processes are additive, so the Gaussian distribution is a popular choice for modeling a random variable.

Raymond
