LC-QuAD 1.0 and LC-QuAD 2.0 are large-scale question answering (QA) datasets of complex questions over knowledge graphs.
The Large-Scale Complex Question Answering Dataset 1.0 (LC-QuAD 1.0)[1] is a question answering dataset of 5,000 questions, each paired with its corresponding SPARQL query. The target knowledge base is DBpedia, specifically the April 2016 version. Please see the original paper for details about the dataset creation process and framework.
This dataset can be downloaded via the link.
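
Since each entry pairs a natural-language question with a SPARQL query, a minimal sketch of reading such pairs might look like the following. It assumes the data is a JSON array of records; the key names `corrected_question` and `sparql_query` are assumptions that should be checked against the downloaded file, and the sample record is made up for illustration rather than taken from the dataset.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class QAPair:
    """One question together with its corresponding SPARQL query."""
    question: str
    sparql: str


def load_pairs(raw_json, question_key="corrected_question", sparql_key="sparql_query"):
    """Parse a JSON array of records into QAPair objects.

    The default key names are assumptions; pass different keys if the
    downloaded file uses other field names.
    """
    records = json.loads(raw_json)
    return [QAPair(r[question_key], r[sparql_key]) for r in records]


# Hypothetical record for illustration only; not an actual dataset entry.
sample = json.dumps([
    {
        "corrected_question": "Who wrote Example Book?",
        "sparql_query": (
            "SELECT DISTINCT ?uri WHERE { "
            "<http://dbpedia.org/resource/Example_Book> "
            "<http://dbpedia.org/ontology/author> ?uri }"
        ),
    }
])

pairs = load_pairs(sample)
for pair in pairs:
    print(pair.question)
    print(pair.sparql)
```

To read from disk instead, pass the file contents (for example `open(path, encoding="utf-8").read()`) as `raw_json`.
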
Year | Type | F1 | Acc | Reported by | Official Repo |
---|---|---|---|---|---|
2021 | SP-based | 71.8 | - | Zheng et al. | - |
2020 | SP-based | 74.8 | - | Chen et al. | Repo |
2019 | IR-based | 33.0 | - | Zheng et al. | Repo |
2018 | SP-based | 75.0 | - | Zafar et al. | Repo |
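
For readers unfamiliar with the F1 column, the sketch below shows one common way to score a KBQA system: compute precision, recall, and F1 between the predicted and gold answer sets for each question, then average over questions (macro F1). This is an assumption about the general metric, not a description of how any paper listed above computed its number; evaluation protocols differ between papers. The function names are hypothetical, and scoring two empty answer sets as 1.0 is a convention chosen here.

```python
def answer_f1(predicted, gold):
    """F1 between one question's predicted and gold answer sets."""
    predicted, gold = set(predicted), set(gold)
    if not predicted and not gold:
        # Convention assumed here: both empty counts as a perfect match.
        return 1.0
    overlap = len(predicted & gold)
    if overlap == 0:
        return 0.0
    precision = overlap / len(predicted)
    recall = overlap / len(gold)
    return 2 * precision * recall / (precision + recall)


def macro_f1(predictions, golds):
    """Average the per-question F1 over all questions."""
    scores = [answer_f1(p, g) for p, g in zip(predictions, golds)]
    return sum(scores) / len(scores) if scores else 0.0


# Two toy questions: one answered exactly, one answered wrongly.
score = macro_f1([{"a"}, {"x"}], [{"a"}, {"y"}])
print(score)
```
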
The Large-Scale Complex Question Answering Dataset 2.0 (LC-QuAD 2.0)[2] is a large question answering dataset of 30,000 questions, each paired with its corresponding SPARQL query. The target knowledge bases are Wikidata and DBpedia, specifically the 2018 versions. Please see the original paper for details about the dataset creation process and framework.
This dataset can be downloaded via the link.
Year | Type | F1 | Acc | Reported by | Official Repo |
---|---|---|---|---|---|
[1] Trivedi, Priyansh, Gaurav Maheshwari, Mohnish Dubey, and Jens Lehmann. LC-QuAD: A corpus for complex question answering over knowledge graphs. In International Semantic Web Conference, pp. 210-218. Springer, Cham, 2017.
[2] Dubey, Mohnish, Debayan Banerjee, Abdelrahman Abdelkawi, and Jens Lehmann. LC-QuAD 2.0: A large dataset for complex question answering over Wikidata and DBpedia. In International Semantic Web Conference, pp. 69-78. Springer, Cham, 2019.