A small improvement in `metrics_sample.py::ROUGE` #217

sadra-barikbin · 2024-07-07T18:02:36Z

Hi there!

This PR fixes a tiny bug and makes a small improvement to the ROUGE class.

NathanHB · 2024-07-09T12:08:00Z

src/lighteval/metrics/metrics_sample.py

@@ -323,6 +323,7 @@ def __init__(
        normalize_gold: callable = None,
        normalize_pred: callable = None,
        aggregation_function: callable = None,
+        tokenizer: object = None,


You could use the tokenizer type here: `transformers.PreTrainedTokenizer`.
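To illustrate the typing question, here is a minimal sketch of a structural type hint that accepts any object exposing a `tokenize(text)` method, which both `transformers.PreTrainedTokenizer` and the NLTK tokenizers provide. `SupportsTokenize` and `RougeLike` are hypothetical names used only for this illustration; they are not part of lighteval.

```python
from typing import List, Optional, Protocol, runtime_checkable


@runtime_checkable
class SupportsTokenize(Protocol):
    """Hypothetical protocol: anything with a tokenize(text) method."""

    def tokenize(self, text: str) -> List[str]: ...


class RougeLike:
    """Hypothetical stand-in for a metric class taking an optional tokenizer."""

    def __init__(self, tokenizer: Optional[SupportsTokenize] = None):
        # runtime_checkable protocols only verify that the method exists,
        # not its signature or return type.
        if tokenizer is not None and not isinstance(tokenizer, SupportsTokenize):
            raise TypeError("tokenizer must expose a tokenize(text) method")
        self.tokenizer = tokenizer


class SplitTokenizer:
    """Toy tokenizer used to exercise the check."""

    def tokenize(self, text: str) -> List[str]:
        return text.split()


metric = RougeLike(tokenizer=SplitTokenizer())
```

A structural hint like this would be looser than `transformers.PreTrainedTokenizer`, at the cost of only checking for the method's presence at runtime.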

NathanHB · 2024-07-09T12:08:56Z

Hi! Thanks for the PR. Could you describe the bug you encountered and how adding the tokenizer to the ROUGE function solves it?

sadra-barikbin · 2024-07-09T12:21:18Z

@NathanHB, one might want to use a tokenizer other than rouge_score's default one. In my case, the default one doesn't support Persian (it removes non-Latin characters), so I used another tokenizer:

rouge_1 = SampleLevelMetric(
    metric="custom_rouge1",
    sample_level_fn=ROUGE("rouge1", tokenizer=nltk.tokenize.SpaceTokenizer()).compute,
    category=MetricCategory.GENERATIVE,
    use_case=MetricUseCase.SUMMARIZATION,
    corpus_level_fn=np.mean,
    higher_is_better=True,
)
extend_enum(Metrics, "custom_rouge1", rouge_1)
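The effect described above can be reproduced with a small standalone sketch. It does not call `rouge_score` itself: `LatinOnlyTokenizer` is an assumed approximation of a tokenizer that keeps only `[a-z0-9]` runs, `WhitespaceTokenizer` is a hypothetical replacement, and `rouge1_f1` is a simplified unigram-overlap F1 without stemming.

```python
import re
from collections import Counter


class LatinOnlyTokenizer:
    """Assumed approximation of a default tokenizer keeping only [a-z0-9] runs."""

    def tokenize(self, text):
        # Lowercase, then replace every run of other characters with a space.
        cleaned = re.sub(r"[^a-z0-9]+", " ", text.lower())
        return cleaned.split()


class WhitespaceTokenizer:
    """Hypothetical replacement that splits on whitespace and keeps any script."""

    def tokenize(self, text):
        return text.split()


def rouge1_f1(gold, pred, tokenizer):
    """Unigram-overlap F1, a simplified ROUGE-1 (no stemming)."""
    gold_counts = Counter(tokenizer.tokenize(gold))
    pred_counts = Counter(tokenizer.tokenize(pred))
    # Counter intersection keeps the minimum count of each shared token.
    overlap = sum((gold_counts & pred_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(pred_counts.values())
    recall = overlap / sum(gold_counts.values())
    return 2 * precision * recall / (precision + recall)


gold = "سلام دنیا"
pred = "سلام دنیا"
latin_score = rouge1_f1(gold, pred, LatinOnlyTokenizer())
space_score = rouge1_f1(gold, pred, WhitespaceTokenizer())
print(latin_score, space_score)
```

With the Latin-only tokenizer both Persian strings tokenize to an empty list, so an exact match still scores zero; the whitespace tokenizer keeps the words and the same pair scores a perfect match.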

NathanHB · 2024-07-09T12:47:39Z

Oh that's great then. Just need to make the tests pass and it should be good to merge :)

Do the impr.

c4aa009

NathanHB reviewed Jul 9, 2024


sadra-barikbin added 2 commits July 17, 2024 13:56

Merge branch 'main' into small-improvement-in-rouge

7668844

Apply ruff to metrics_sample.py

5c7f67d

sadra-barikbin force-pushed the small-improvement-in-rouge branch from b73ad9c to 5c7f67d July 17, 2024 10:31

sadra-barikbin requested a review from NathanHB July 17, 2024 10:33

clefourrier and others added 6 commits July 17, 2024 16:00

Merge branch 'main' into small-improvement-in-rouge

051f778

Merge branch 'main' into small-improvement-in-rouge

5932c05

Merge branch 'main' into small-improvement-in-rouge

98529a4

Merge branch 'main' into small-improvement-in-rouge

0197d90

Merge branch 'main' into small-improvement-in-rouge

be6fa91

Merge branch 'main' into small-improvement-in-rouge

d66e568

NathanHB approved these changes Aug 14, 2024


NathanHB merged commit b54c189 into huggingface:main Aug 14, 2024
2 checks passed

sadra-barikbin deleted the small-improvement-in-rouge branch August 14, 2024 19:34
