Threshold of positive examples

Amanda_dos_Santos · February 11, 2024, 4:53am

In the video “Anomaly detection vs. supervised learning”, the instructor said that having only a few positive examples means roughly 0 to 20 of them. But I find absolute numbers not very enlightening, given how much dataset sizes vary in real applications. Is there a percentage value we could consider instead?

TMosh · February 11, 2024, 5:23am

A couple of thoughts:

If the anomalous examples make up a measurable percentage of the dataset (say, more than 1%), then you may have enough examples to use supervised learning.
The reason he says 20 examples is that this is a statistically large enough number to provide useful results.
Anomaly detection uses the “nominal” examples to create a statistical model of what is normal behavior. It doesn’t use the anomalies for statistics - only for setting a threshold. In practice 20 examples seems to be enough for useful metrics in many situations.
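
To make the last point concrete, here is a minimal sketch of that workflow. It assumes independent Gaussian features and picks the threshold by F1 score on a small labeled validation set; the synthetic data, the function names (`fit_gaussian`, `density`, `select_threshold`), and the 20-anomaly validation set are all illustrative assumptions, not the course's exact code.

```python
import numpy as np

def fit_gaussian(X):
    """Estimate per-feature mean and variance from nominal examples only."""
    return X.mean(axis=0), X.var(axis=0)

def density(X, mu, var):
    """Product of independent univariate Gaussian densities for each row."""
    coef = 1.0 / np.sqrt(2.0 * np.pi * var)
    expo = np.exp(-((X - mu) ** 2) / (2.0 * var))
    return np.prod(coef * expo, axis=1)

def select_threshold(y_val, p_val, n_steps=1000):
    """Scan candidate thresholds and keep the one with the best F1."""
    best_eps, best_f1 = 0.0, 0.0
    for eps in np.linspace(p_val.min(), p_val.max(), n_steps):
        pred = p_val < eps  # low density => flagged as anomaly
        tp = np.sum(pred & (y_val == 1))
        fp = np.sum(pred & (y_val == 0))
        fn = np.sum(~pred & (y_val == 1))
        if tp == 0:
            continue  # precision/recall undefined or zero
        precision = tp / (tp + fp)
        recall = tp / (tp + fn)
        f1 = 2 * precision * recall / (precision + recall)
        if f1 > best_f1:
            best_eps, best_f1 = eps, f1
    return best_eps, best_f1

# Synthetic data (assumption): nominal points near 0, anomalies near 6.
rng = np.random.default_rng(0)
X_train = rng.normal(0.0, 1.0, size=(5000, 2))      # nominal only, unlabeled
X_val = np.vstack([
    rng.normal(0.0, 1.0, size=(1000, 2)),            # nominal validation points
    rng.normal(6.0, 1.0, size=(20, 2)),              # the ~20 known anomalies
])
y_val = np.concatenate([np.zeros(1000, dtype=int), np.ones(20, dtype=int)])

mu, var = fit_gaussian(X_train)      # the model never sees an anomaly
p_val = density(X_val, mu, var)
eps, f1 = select_threshold(y_val, p_val)
print(f"chosen epsilon = {eps:.3e}, validation F1 = {f1:.3f}")
```

Note that the anomalies appear only in `select_threshold`. The statistical model itself is built from `X_train` alone, which is why a handful of labeled positives can be enough here, while a supervised classifier would need to learn from them directly.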
