In the video “Anomaly detection vs. supervised learning”, the instructor said that a typical number of positive (anomalous) examples would be between 0 and 20. But I think using absolute numbers is not very enlightening, given the very different dataset sizes found in real applications. Is there a percentage value we could consider instead?
A couple of thoughts:

If the anomalous examples make up a meaningful proportion of your dataset (e.g. more than 1%), then you may have enough of them to use supervised learning.

The reason he says 20 examples is that this is roughly the smallest number that is statistically large enough to give useful results.

Anomaly detection uses the “nominal” examples to create a statistical model of what normal behavior looks like. It doesn’t use the anomalies for fitting that model, only for setting a threshold. In practice, 20 anomalous examples seem to be enough to compute useful evaluation metrics in many situations.
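To make the split concrete, here is a minimal sketch in the spirit of the course: a per-feature Gaussian model is fit on nominal examples only, and a small labeled set (held-out nominals plus the ~20 anomalies) is used solely to pick the density threshold by F1 score. The data here is synthetic and the function names are my own, not from the course materials.

```python
import numpy as np

def fit_gaussian(X):
    # Estimate per-feature mean and variance from nominal examples only
    return X.mean(axis=0), X.var(axis=0)

def log_density(X, mu, var):
    # Log of the product of independent per-feature Gaussian densities
    return np.sum(-0.5 * np.log(2 * np.pi * var)
                  - (X - mu) ** 2 / (2 * var), axis=1)

rng = np.random.default_rng(0)
X_nominal = rng.normal(0.0, 1.0, size=(1000, 2))  # "normal" behavior
X_anomaly = rng.normal(5.0, 1.0, size=(20, 2))    # ~20 labeled anomalies

# The statistical model is fit on nominal data only
mu, var = fit_gaussian(X_nominal)

# Anomalies are used only here: picking the threshold epsilon
X_val = np.vstack([rng.normal(0.0, 1.0, size=(200, 2)), X_anomaly])
y_val = np.concatenate([np.zeros(200), np.ones(20)])  # 1 = anomaly
scores = log_density(X_val, mu, var)

best_f1, best_eps = 0.0, None
for eps in np.linspace(scores.min(), scores.max(), 200):
    pred = (scores < eps).astype(int)  # low density => flagged as anomaly
    tp = np.sum((pred == 1) & (y_val == 1))
    fp = np.sum((pred == 1) & (y_val == 0))
    fn = np.sum((pred == 0) & (y_val == 1))
    if tp == 0:
        continue
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    if f1 > best_f1:
        best_f1, best_eps = f1, eps
```

With only 20 anomalies, there are still enough positive validation examples for the F1 sweep to settle on a sensible threshold, which is the practical point behind the "20 examples" figure.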