
Fix mmlu bpb bug only scoring answer=A questions #718

Merged · 1 commit into main · Sep 6, 2024

Conversation

OyvindTafjord (Contributor)

Same problem as was fixed for the oe-eval tasks in #712; I forgot there was separate handling for MMLU.

With the previous code, only questions with gold answer A were counted in the bpb evaluations; now they should all be counted.
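
A minimal sketch of the failure mode (function and field names here are illustrative, not the actual OLMo code): when only the gold continuation is kept for bpb and re-indexed to 0, a downstream `label_id == cont_id` check passes only for questions whose gold answer was A, unless the stored label is also reset to 0.

```python
# Hypothetical sketch; names are illustrative, not the real OLMo eval code.

def prep_bpb_rows(examples, reset_label):
    """Keep only the gold continuation per question, as done for bpb."""
    rows = []
    for ex in examples:
        gold = ex["label"]                 # 0..3 for answers A..D
        cont = ex["choices"][gold]
        rows.append({
            "cont_id": 0,                  # the single kept continuation is re-indexed to 0
            "label_id": 0 if reset_label else gold,
            "cont_len": len(cont),
        })
    return rows

def scored_questions(rows):
    # a scorer that keys on label_id == cont_id silently drops every mismatching row
    return [r for r in rows if r["label_id"] == r["cont_id"]]

examples = [
    {"label": 0, "choices": ["apple", "pear", "plum", "fig"]},   # gold answer A
    {"label": 2, "choices": ["red", "green", "blue", "cyan"]},   # gold answer C
]
print(len(scored_questions(prep_bpb_rows(examples, reset_label=False))))  # 1 -> only answer-A questions
print(len(scored_questions(prep_bpb_rows(examples, reset_label=True))))   # 2 -> all questions
```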

@liujch1998

Thanks Oyvind! Approving this PR.

Though it seems to me that not resetting label to 0 is fine for MMLU — MMLU’s prep_examples(), which is inherited from ICLMultiChoiceTaskDataset, does not skip cases where label_id and cont_id mismatch when metric=bpb (whereas OEEvalTask does), and ICLMetric.compute() does not use label_id when metric=bpb. So questions with any label_id should have already been included in the metric computation.
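
To make the contrast with the sketch above concrete, here is one way the MMLU path could behave as described. This assumes prep keeps a row for every continuation and the bpb aggregation selects gold rows by a per-row flag set at prep time rather than by comparing label_id to cont_id; both are assumptions for illustration, not the actual ICLMultiChoiceTaskDataset / ICLMetric code.

```python
import math

# Hypothetical sketch of the reviewer's reading; names are illustrative only.

def prep_all_rows(examples):
    """MMLU-style prep: keep a row for every continuation, gold or not."""
    rows = []
    for ex in examples:
        for cont_id, cont in enumerate(ex["choices"]):
            rows.append({
                "cont_id": cont_id,
                "label_id": ex["label"],
                "is_gold": cont_id == ex["label"],
                "cont_byte_len": len(cont.encode("utf-8")),
                "loss_nats": 1.5,  # stand-in for the model's summed NLL on this continuation
            })
    return rows

def bpb(rows):
    """Aggregate bits-per-byte over gold rows without ever filtering on label_id."""
    gold = [r for r in rows if r["is_gold"]]
    bits = sum(r["loss_nats"] / math.log(2) for r in gold)
    nbytes = sum(r["cont_byte_len"] for r in gold)
    return bits / nbytes

examples = [
    {"label": 0, "choices": ["apple", "pear", "plum", "fig"]},
    {"label": 2, "choices": ["red", "green", "blue", "cyan"]},
]
print(bpb(prep_all_rows(examples)))  # both questions contribute, whatever their gold label
```

Under those assumptions, resetting the label to 0 is harmless for MMLU but not strictly required, which is the reviewer's point.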

OyvindTafjord merged commit 0b92077 into main on Sep 6, 2024
10 of 12 checks passed
OyvindTafjord deleted the ot-fix-mmlu-bpb branch on September 6, 2024 at 17:42