As discussed in C1W1, precision and recall are used to evaluate ML model performance on skewed datasets.
I have been using these two metrics as well, but recently had a discussion about recall being a prevalence-independent statistic. My understanding is that recall is more stable than precision: if we evaluate the same model on several test sets, each with a different distribution of the class label y, the recalls on those test sets will be quite similar, but the precisions will not.
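To make this concrete, here is a minimal sketch of what I mean, assuming a hypothetical classifier whose per-class behaviour is fixed (TPR = 0.9, FPR = 0.1) while only the prevalence of the positive class changes between "test sets":

```python
# A hypothetical classifier with fixed per-class behaviour:
# TPR = 0.9 (it catches 90% of positives), FPR = 0.1 (it flags 10% of negatives).
# Only the prevalence of the positive class changes between "test sets".

def precision_recall(prevalence, tpr=0.9, fpr=0.1, n=100_000):
    pos = prevalence * n            # actual positives
    neg = n - pos                   # actual negatives
    tp = tpr * pos                  # true positives
    fp = fpr * neg                  # false positives
    fn = pos - tp                   # false negatives
    return tp / (tp + fp), tp / (tp + fn)

for p in (0.5, 0.1, 0.01):
    precision, recall = precision_recall(p)
    print(f"prevalence={p:<5} precision={precision:.3f} recall={recall:.3f}")
```

With these numbers, recall stays at 0.9 regardless of prevalence, while precision drops from 0.9 to roughly 0.08 as positives become rarer.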
Is my understanding correct?
Thanks.
Dear Giovanni,
Thank you very much for your question. As you already know, precision is calculated as:
True positives / (True positives + False positives)
We use this metric when the cost of a false positive is high. For example, in a spam filter, if the classifier marks a user's important emails as spam, the user loses those emails, so false positives carry a high cost for us.
On the other hand, recall is calculated as:
True positives / (True positives + False negatives)
so by employing recall we penalize false negatives. Imagine we are developing a model that detects cancer in patients. Classifying someone who actually has cancer as healthy has an extremely high cost, so here we care about recall rather than precision.
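As a quick illustration of both formulas, here is a toy sketch using scikit-learn's metrics (the labels below are made up for the cancer example):

```python
from sklearn.metrics import precision_score, recall_score

# Toy cancer-screening labels: 1 = has cancer, 0 = healthy (made-up data)
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 1, 0, 0, 0]

# Precision = TP / (TP + FP): of everyone we flagged, how many really have cancer?
print("precision:", precision_score(y_true, y_pred))  # 2 / (2 + 2) = 0.5
# Recall = TP / (TP + FN): of everyone who has cancer, how many did we catch?
print("recall:", recall_score(y_true, y_pred))        # 2 / (2 + 1) ≈ 0.667
```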
In both cases the classes are imbalanced, and which metric is the right one depends heavily on the use case.
If you simply want a better understanding of a classifier trained on an imbalanced dataset, you could also look at the true positive and true negative rates. You can read more about them at this link.
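For completeness, here is a minimal sketch of computing those two rates from a confusion matrix (again with scikit-learn, purely as an example, reusing the toy labels from above):

```python
from sklearn.metrics import confusion_matrix

y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 1, 0, 0, 0]

# With binary labels, ravel() yields the counts in the order tn, fp, fn, tp.
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
tpr = tp / (tp + fn)  # true positive rate (sensitivity; identical to recall)
tnr = tn / (tn + fp)  # true negative rate (specificity)
print(f"TPR = {tpr:.3f}, TNR = {tnr:.3f}")  # TPR ≈ 0.667, TNR = 0.600
```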
Have a lovely day!
Kiavash