In the benchmarks, when we see ROUGE scores (ROUGE-1, ROUGE-2, ROUGE-L), what is the actual underlying score? Recall or Precision or F1 or some other transformation of some combination of these values?
This is an article that explains each one of them, with their respective formulas in terms of precision, recall, and F1. To go straight to the answer:

The mean of the per-example F1 scores gives the full ROUGE-1 score for the dataset (and similarly for ROUGE-2 and ROUGE-L).
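A minimal sketch of that computation, assuming whitespace tokenization and clipped n-gram overlap counts (reference implementations also apply stemming and other normalization, so exact scores will differ):

```python
from collections import Counter

def rouge_n(candidate, reference, n=1):
    """Per-example ROUGE-N: returns (precision, recall, F1)."""
    def ngrams(tokens, n):
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    cand = ngrams(candidate.split(), n)
    ref = ngrams(reference.split(), n)
    # Clipped overlap: each n-gram counted at most min(candidate, reference) times
    overlap = sum((cand & ref).values())
    precision = overlap / max(sum(cand.values()), 1)
    recall = overlap / max(sum(ref.values()), 1)
    f1 = 2 * precision * recall / (precision + recall) if overlap else 0.0
    return precision, recall, f1

# Dataset-level ROUGE-1 = mean of per-example F1 scores
pairs = [
    ("the cat sat", "the cat sat on the mat"),
    ("a dog ran", "the dog ran home"),
]
f1_scores = [rouge_n(cand, ref, n=1)[2] for cand, ref in pairs]
rouge1 = sum(f1_scores) / len(f1_scores)
```

Precision divides the overlap by the candidate's n-gram count and recall by the reference's, which is why F1 is the balanced summary usually reported in benchmarks.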