Add new Arabic benchmarks (5) and enhance existing tasks (#372) · huggingface/lighteval@de8dba3

Add new Arabic benchmarks (5) and enhance existing tasks (#372)

* Update arabic_evals.py

Add new Arabic benchmarks and update existing tasks

- Renamed `arabic_mmlu` to `arabic_mmlu_mt` to highlight its machine-translated origin.
- Added new benchmarks: `arabic_mmlu` (ArabicMMLU, https://arxiv.org/abs/2402.12840), `arabic_mmlu_ht` (human-translated), and `MadinahQA` from MBZUAI, as well as `arabic_mmmlu` (OpenAI MMMLU) and `AraTrust`, a trustworthiness benchmark for Arabic LLMs (https://arxiv.org/abs/2403.09017).
- Enhanced prompt functions for better flexibility in answer options.
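
As an illustration of what "flexibility in answer options" can mean for a multiple-choice prompt function, here is a minimal sketch that accepts any number of options instead of a fixed four. It is not the code from `arabic_evals.py`: the function name `build_mcq_prompt`, the `Question:`/`Answer:` wording, and the Latin letter labels are all assumptions made for this example, and it deliberately uses no lighteval APIs.

```python
from typing import Sequence

# Hypothetical label set for illustration; a real Arabic benchmark may use
# Arabic letters instead. Pass a different sequence via `labels` to change it.
DEFAULT_LABELS = ("A", "B", "C", "D", "E")


def build_mcq_prompt(
    question: str,
    options: Sequence[str],
    labels: Sequence[str] = DEFAULT_LABELS,
) -> tuple[str, list[str]]:
    """Format a multiple-choice prompt for a variable number of options.

    Returns the prompt text and the list of labels actually used, so the
    caller can score the model's answer against exactly those labels.
    """
    if not options:
        raise ValueError("at least one answer option is required")
    if len(options) > len(labels):
        raise ValueError("more answer options than available labels")

    # Only keep as many labels as there are options.
    used = list(labels[: len(options)])
    lines = [f"Question: {question}"]
    lines += [f"{label}. {option}" for label, option in zip(used, options)]
    lines.append("Answer:")
    return "\n".join(lines), used
```

Calling `build_mcq_prompt("Q?", ["x", "y", "z"])` yields a five-line prompt and the label list `["A", "B", "C"]`; returning the labels alongside the prompt keeps the scoring side in sync when a question has fewer or more options than usual.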

* Update and rename OALL_tasks.txt to OALL_v1_tasks.txt

Rename file to reflect that it contains the v1 leaderboard tasks

* Create OALL_v2_tasks.txt

Tasks for v2 of OALL

* Update all_arabic_tasks.txt

add new and renamed tasks

* Update arabic_evals.py

Fix formatting issues for

* Update all_arabic_tasks.txt

Add missing task: OpenAI's MMMLU Arabic subset

* Update all_arabic_tasks.txt

Correct order

* Update arabic_evals.py

remove openai mmmlu task following the discussion here: #372

* Update all_arabic_tasks.txt

remove openai mmmlu task following the discussion here: #372

* Update tasks.py

Add a templated version of Arabic MMLU, per @hynky1999's request in PR #372

* Update tasks.py

remove arabic_mmlu_templated_tasks

---------

Co-authored-by: Clémentine Fourrier <[email protected]>
Co-authored-by: Nathan Habib <[email protected]>

3 people authored Dec 11, 2024

1 parent 6ad7276 commit de8dba3
