This is a fork of https://github.com/stanford-crfm/helm, which we used for the 2023 NeurIPS LLM efficiency competition (https://llm-efficiency-challenge.github.io/).
It was kept private because the tasks we were testing on could not be disclosed to the final participants. They included:
- Math
- Corr2cause
- Justice
- Samsum
- Ethics
If you're interested in using these tasks in your own work, please feel free to copy and paste them.