As the title suggests, what’s a good ROUGE/BLEU score to aim for? Have there been any studies done to correlate these scores with how humans perceive the output?
ROUGE and BLEU are commonly used metrics for evaluating the quality of text generation models. There is no universal threshold for a “good” score — it depends on the task, dataset, and domain — but higher scores generally indicate better performance.
As for the correlation between these scores and human perception, studies have found a positive correlation between higher ROUGE and BLEU scores and human judgments of quality. However, it’s important to note that these metrics only measure surface-level n-gram overlap, in terms of precision and recall, and may not capture more nuanced aspects of language use that humans consider important, such as fluency, coherence, or factual accuracy.
Thank you for posting. Happy Learning!
To add to @Atharva_Divekar’s great and clear answer, and as a means of documentation for future learners: both scores range from 0 to 1, where 0 means no overlap and 1 means a perfect match.
Another item to remember:
BLEU is primarily used to grade the quality of machine translations.
ROUGE is primarily used to grade the quality of text summarizations.
Other uses may apply, but these are the primary ones.
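To make the precision/recall distinction above concrete, here is a minimal sketch of the unigram versions of both ideas, written from scratch rather than with a metrics library. It shows BLEU-1-style clipped precision (what fraction of the candidate’s tokens appear in the reference) and ROUGE-1-style recall (what fraction of the reference’s tokens are recovered). Real BLEU also combines higher-order n-grams and a brevity penalty, and real ROUGE has several variants (ROUGE-L, ROUGE-2, etc.), so treat this purely as an illustration:

```python
from collections import Counter

def unigram_precision(candidate: str, reference: str) -> float:
    # BLEU-1-style modified precision: fraction of candidate tokens
    # that also appear in the reference, with counts clipped so a
    # repeated candidate token can't be credited more times than it
    # occurs in the reference.
    cand, ref = candidate.split(), reference.split()
    cand_counts, ref_counts = Counter(cand), Counter(ref)
    overlap = sum(min(c, ref_counts[w]) for w, c in cand_counts.items())
    return overlap / len(cand) if cand else 0.0

def unigram_recall(candidate: str, reference: str) -> float:
    # ROUGE-1-style recall: fraction of reference tokens that are
    # recovered by the candidate, again with clipped counts.
    cand, ref = candidate.split(), reference.split()
    cand_counts, ref_counts = Counter(cand), Counter(ref)
    overlap = sum(min(r, cand_counts[w]) for w, r in ref_counts.items())
    return overlap / len(ref) if ref else 0.0

reference = "the cat sat on the mat"
candidate = "the cat is on the mat"
print(unigram_precision(candidate, reference))  # 5 of 6 tokens match -> ~0.833
print(unigram_recall(candidate, reference))     # 5 of 6 reference tokens recovered -> ~0.833
```

In practice you would use an established implementation (e.g. the `sacrebleu` or `rouge_score` packages) rather than rolling your own, since tokenization and smoothing choices significantly affect the reported numbers.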