SloATOMIC 2020 Data Set

SloATOMIC 2020 contains Slovene translations of the examples in the ATOMIC 2020 data set. The translations were produced with the DeepL translation service.

The main purpose of the data set is to train Slovene commonsense reasoning models.

🗃️ Data

The data set is publicly available via the clarin.si repository.

The data set is also available in the data folder, which contains the following files:

sloatomic_train.tsv: The training set.
sloatomic_dev.tsv: The development set.
sloatomic_test.tsv.automatic_all: The test set containing all of the automatically translated examples.
sloatomic_test.tsv.automatic_10k: A selection of 10k examples from the complete test set.
sloatomic_test.tsv.manual_10k: The examples of the automatic 10k subset after manual inspection and correction.

Data Format

The data is in TSV (tab-separated values) format. Each line contains one example. The columns are:

head_event: The head event of the example.
relation: The relation between the head event and the tail event.
tail_event: The tail event of the example.
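
The column layout above can be read with Python's standard csv module. The following is a minimal sketch, not part of the data set's own tooling: the names Example, parse_examples and load_examples are hypothetical, the sample row is made up for illustration, and the code assumes every line has exactly three tab-separated fields without quoting. Because the README does not say whether the files start with a header row, the sketch skips a first row that merely repeats the column names.

```python
import csv
from typing import Iterable, Iterator, NamedTuple


class Example(NamedTuple):
    """One (head, relation, tail) triple; field names follow the README columns."""
    head_event: str
    relation: str
    tail_event: str


def parse_examples(lines: Iterable[str]) -> Iterator[Example]:
    """Yield one Example per tab-separated line.

    Assumptions (not confirmed by the README): fields are not quoted,
    and rows without exactly three fields can be skipped.
    """
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) != 3:
            continue  # skip blank or malformed lines
        if tuple(row) == Example._fields:
            continue  # skip a header row, if the file has one
        yield Example(*row)


def load_examples(path: str) -> list[Example]:
    """Read a whole file, e.g. data/sloatomic_train.tsv, into memory."""
    with open(path, encoding="utf-8", newline="") as f:
        return list(parse_examples(f))


if __name__ == "__main__":
    # Made-up row for illustration only; real rows come from the data folder.
    sample = ["OsebaX gre v trgovino\txNeed\timeti denar\n"]
    for ex in parse_examples(sample):
        print(ex.relation, "|", ex.head_event, "->", ex.tail_event)
```

Calling load_examples("data/sloatomic_train.tsv") would return the training set as a list of triples, assuming the repository's data folder is in the working directory.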

📚 Papers

The data set was used in the following papers:

SLOmet - Slovenian Commonsense Description. Adrian Mladenić Grobelnik, Erik Novak, Dunja Mladenić, Marko Grobelnik. SiKDD Slovenian KDD Conference, 2022.

🔎 Reference

If you use the data set in your research, please cite it as follows:

 @misc{11356/1724,
   title = {Slovene Translation of the Atomic 2020 data set {SloATOMIC} 2020},
   author = {Mladeni{\'c} Grobelnik, Adrian and Novak, Erik and Mladeni{\'c}, Dunja and Grobelnik, Marko},
   url = {http://hdl.handle.net/11356/1724},
   note = {Slovenian language resource repository {CLARIN}.{SI}},
   copyright = {Creative Commons - Attribution-{ShareAlike} 4.0 International ({CC} {BY}-{SA} 4.0)},
   issn = {2820-4042},
   year = {2022} 
 }

📣 Acknowledgments

This work was developed by the Department of Artificial Intelligence at the Jozef Stefan Institute.

The work is supported by the Slovenian Research Agency and the RSDO project.

⚖️ License

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
