A small improvement in metrics_sample.py::ROUGE #217

Merged: 9 commits, Aug 14, 2024

sadra-barikbin
Contributor

Hi there!

This PR fixes a tiny bug and makes a small improvement to the ROUGE class.

@@ -323,6 +323,7 @@ def __init__(
normalize_gold: callable = None,
normalize_pred: callable = None,
aggregation_function: callable = None,
tokenizer: object = None,
Member

You could use the tokenizer type here instead of object: transformers.PreTrainedTokenizer

@NathanHB
Member

NathanHB commented Jul 9, 2024

Hi! Thanks for the PR. Could you describe the bug you encountered and how adding the tokenizer to the ROUGE function solves it?

@sadra-barikbin
Contributor Author

sadra-barikbin commented Jul 9, 2024

@NathanHB, one might want to use a tokenizer other than rouge_score's default one. In my case, the default tokenizer doesn't support Persian (it removes non-Latin characters), so I tried using another tokenizer:

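import nltk
import numpy as np
from aenum import extend_enum

# SampleLevelMetric, ROUGE, MetricCategory, MetricUseCase and Metrics come from
# lighteval; the exact import paths depend on the installed version.
rouge_1 = SampleLevelMetric(
    metric="custom_rouge1",
    # Split on spaces only, so non-Latin tokens are kept.
    sample_level_fn=ROUGE("rouge1", tokenizer=nltk.tokenize.SpaceTokenizer()).compute,
    category=MetricCategory.GENERATIVE,
    use_case=MetricUseCase.SUMMARIZATION,
    corpus_level_fn=np.mean,
    higher_is_better=True,
)
extend_enum(Metrics, "custom_rouge1", rouge_1)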
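
To illustrate why the tokenizer matters here, the following is a minimal standard-library sketch, not the rouge_score or lighteval implementation. It assumes that the default tokenizer keeps only lowercase ASCII letters and digits (my understanding of rouge_score's behaviour); `AsciiOnlyTokenizer`, `WhitespaceTokenizer` and `rouge1_f1` are hypothetical names introduced only for this example.

```python
import re
from collections import Counter


class AsciiOnlyTokenizer:
    """Hypothetical stand-in for an ASCII-only default tokenizer."""

    def tokenize(self, text):
        # Lowercase, then turn every run of characters outside [a-z0-9] into a space.
        cleaned = re.sub(r"[^a-z0-9]+", " ", text.lower())
        return cleaned.split()


class WhitespaceTokenizer:
    """Hypothetical replacement that only splits on whitespace."""

    def tokenize(self, text):
        return text.split()


def rouge1_f1(gold, pred, tokenizer):
    """Unigram-overlap F1 between gold and pred (no stemming)."""
    gold_counts = Counter(tokenizer.tokenize(gold))
    pred_counts = Counter(tokenizer.tokenize(pred))
    # Clipped unigram matches: the multiset intersection of both token bags.
    overlap = sum((gold_counts & pred_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(pred_counts.values())
    recall = overlap / sum(gold_counts.values())
    return 2 * precision * recall / (precision + recall)


# Identical Persian strings: a perfect match should score 1.0.
gold = "سلام دنیا"
pred = "سلام دنیا"
ascii_score = rouge1_f1(gold, pred, AsciiOnlyTokenizer())
space_score = rouge1_f1(gold, pred, WhitespaceTokenizer())
print(ascii_score, space_score)
```

Under these assumptions, the ASCII-only tokenizer yields no tokens for Persian text, so even an exact match scores zero, while the whitespace tokenizer scores it as a full match. Any object with a `tokenize(text)` method, such as `nltk.tokenize.SpaceTokenizer()`, fits the same shape.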

@NathanHB
Member

NathanHB commented Jul 9, 2024

Oh that's great then. Just need to make the tests pass and it should be good to merge :)

@sadra-barikbin sadra-barikbin force-pushed the small-improvement-in-rouge branch from b73ad9c to 5c7f67d Compare July 17, 2024 10:31
@sadra-barikbin sadra-barikbin requested a review from NathanHB July 17, 2024 10:33
@NathanHB NathanHB merged commit b54c189 into huggingface:main Aug 14, 2024
2 checks passed
@sadra-barikbin sadra-barikbin deleted the small-improvement-in-rouge branch August 14, 2024 19:34
3 participants