Code to run experiments presented in the paper presented at VarDial 2024 by Verkijk, Sommerauer and Vossen

StellaVerkijk/VarDial2024

VarDial2024

Screenshot 2024-06-11 at 14 15 58

This repository contains code and data to run the experiments discussed in the paper presented at VarDial, NAACL 2024 (June), by Verkijk, Sommerauer and Vossen, as well as the collection of annotated data presented in the same paper (Studying Language Variation Considering the Re-Usability of Modern Theories, Tools and Resources for Annotating Explicit and Implicit Events in Centuries Old Text). This work is part of the GLOBALISE project.

annotated_data

This folder contains all annotated data collected thus far for event detection and classification within GLOBALISE.

  • train
    • train_2 Documents annotated by trained annotators in Round 2 as described in the paper - 54 pages
    • train_3 Documents annotated in Round 3 as described in the paper - 57 pages
  • test
    • curated One document annotated and subsequently curated by four historians and a linguist - 5 pages
    • non-curated Two documents, one annotated in Round 2 by two annotator teams and one annotated in Round 3 by four annotator teams, which are to be curated as an addition to the test set - 13 pages

The documents in non-curated are also the ones used for calculating the inter-annotator agreement (IAA). The documents in train_2 are also used in the LLM-finetuning experiment, where this data is split into train and test sets.
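The folder layout described above can be walked with a short script. This is a minimal sketch, not part of the repository: it assumes the four subfolders are nested exactly as in the list (train/train_2, train/train_3, test/curated, test/non-curated) under a local annotated_data directory, and it makes no assumption about the file format of the annotations. The names SUBSETS and list_annotated_files are hypothetical helpers introduced for illustration.

```python
from pathlib import Path

# Subfolder names come from the list above; the nesting and the root
# path "annotated_data" are assumptions about a local checkout.
SUBSETS = {
    "train_2": Path("train") / "train_2",
    "train_3": Path("train") / "train_3",
    "curated": Path("test") / "curated",
    "non-curated": Path("test") / "non-curated",
}


def list_annotated_files(root):
    """Map each subset name to the sorted list of files found beneath it."""
    root = Path(root)
    files = {}
    for name, rel in SUBSETS.items():
        folder = root / rel
        if folder.is_dir():
            # Collect files at any depth; directories are skipped.
            files[name] = sorted(p for p in folder.rglob("*") if p.is_file())
        else:
            # A missing folder yields an empty list rather than an error.
            files[name] = []
    return files


if __name__ == "__main__":
    for name, paths in list_annotated_files("annotated_data").items():
        print(f"{name}: {len(paths)} file(s)")
```

Run from the repository root, this prints one file count per subset, which is a quick way to check that a checkout is complete before running the experiments.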

Overview with metadata: Screenshot 2024-06-17 at 01 36 45

zero-shot_experiments

This folder contains code and data to reproduce the zero-shot experiments presented in the paper.

LLM-finetuning

This folder contains code and data to fine-tune (L)LMs on the task of event detection as described in the paper. The code was written to be run on an HPC cluster. The original code written for NER is by Sophie Arnoult.

Screenshot 2024-06-12 at 19 18 16
