scientific_authorship

This repository contains a dataset of scientific articles published at ACL and EMNLP up to 2018.

Article contents are stored in XML format, as extracted from the corresponding PDFs using Grobid. For each article in the dataset, the data is organized into three separate parts: article contents, references, and headers.
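
As a rough illustration of how such files might be read, here is a minimal sketch in Python. It assumes the XML follows Grobid's usual TEI output (elements in the `http://www.tei-c.org/ns/1.0` namespace); the files in this repository may be split or structured differently, so the paths queried below should be checked against an actual file. The function name `extract_title_and_paragraphs` and the `SAMPLE` document are hypothetical and exist only for this example.

```python
import xml.etree.ElementTree as ET

# Namespace used by Grobid's TEI output (assumed to apply to this dataset).
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}


def extract_title_and_paragraphs(xml_text):
    """Return (title, list of body paragraphs) from a TEI XML string.

    Hypothetical helper: the element paths are assumptions based on
    typical Grobid output, not on this dataset's documented layout.
    """
    root = ET.fromstring(xml_text)

    # The title normally sits under teiHeader/fileDesc/titleStmt.
    title_el = root.find(".//tei:titleStmt/tei:title", TEI_NS)
    title = "".join(title_el.itertext()).strip() if title_el is not None else None

    # Collect the text of every <p> inside <body>, including nested markup.
    paragraphs = [
        "".join(p.itertext()).strip()
        for p in root.iterfind(".//tei:body//tei:p", TEI_NS)
    ]
    return title, paragraphs


# Tiny made-up document standing in for one article file.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader><fileDesc><titleStmt><title>An Example Paper</title></titleStmt></fileDesc></teiHeader>
  <text><body><div><head>Introduction</head>
    <p>First paragraph.</p><p>Second <hi>paragraph</hi>.</p>
  </div></body></text>
</TEI>"""

title, paragraphs = extract_title_and_paragraphs(SAMPLE)
print(title)
print(paragraphs)
```

To process a real file, read its contents into a string (or use `ET.parse` on the path) and adjust the element paths if the header, references, and contents files use different roots.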

Copyright

This data is distributed under a Creative Commons License. When using this data in your research, please cite the following publication:

Caragea, Cornelia, Ana Uban, and Liviu P. Dinu. "The Myth of Double-Blind Review Revisited: ACL vs. EMNLP." In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pp. 2317-2327. 2019.