Paper | Website | Leaderboard | Download data
CRoW is a multi-task benchmark for evaluating the commonsense reasoning ability of NLP systems on real-world tasks where this ability is required.
This repo contains the code used to build the CRoW benchmark and evaluate models on it. If you would like to download the data and evaluate your own models, please check out the Tasks section. We also maintain an active leaderboard for this benchmark, and you can contribute to it by following the Getting Started guide.
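If you want a quick look at the data before wiring it into your own evaluation pipeline, a minimal sketch along these lines may help. It assumes the downloaded files are JSON; the path below is a hypothetical placeholder, not the official loader, so see the Tasks section for the actual download location and format.

```python
# Minimal sketch for inspecting downloaded CRoW data (not the official loader).
# Assumption: the data was downloaded as a JSON list of example dicts; the path
# below is hypothetical.
import json
from pathlib import Path

data_path = Path("data/example_task/test.json")  # hypothetical path
with data_path.open() as f:
    examples = json.load(f)

print(f"Loaded {len(examples)} examples")
for example in examples[:3]:
    print(example)  # inspect the raw fields before writing an evaluation loop
```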
For more information on this benchmark, check the website.
```bibtex
@inproceedings{ismayilzada2023crow,
  title={CRoW: Benchmarking Commonsense Reasoning in Real-World Tasks},
  author={Mete Ismayilzada and Debjit Paul and Syrielle Montariol and Mor Geva and Antoine Bosselut},
  booktitle={Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP)},
  year={2023}
}
```