Can large language models (LLMs) recognize link specifications (LS)? We hypothesize that LS can be treated as a low-resource language. In this project, we tackle the problem of automating the generation of LS from natural language (NL) using LLMs, a task we refer to as NL2LS. The NL2LS framework handles both English and German NL inputs.
We address this task using:
- Rule-based methods (regex-based)
- Zero-shot learning via prompting
- Supervised fine-tuning of LLMs
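The rule-based baseline can be sketched as a set of regular expressions that map recognizable NL phrasings directly to LS strings. The pattern and the LS template below are illustrative assumptions for exposition, not the project's actual rules:

```python
import re
from typing import Optional

# Hypothetical regex rule: matches sentences of the form
# "Link ... where <measure> similarity of <src> and <tgt> is at least <theta>".
NL_PATTERN = re.compile(
    r"link .* where (?P<measure>\w+) similarity of (?P<src>[\w.]+) "
    r"and (?P<tgt>[\w.]+) is at least (?P<theta>[0-9.]+)",
    re.IGNORECASE,
)

def nl_to_ls(sentence: str) -> Optional[str]:
    """Translate one NL sentence into an LS string, or return None if no rule fires."""
    m = NL_PATTERN.search(sentence)
    if m is None:
        return None
    # Emit an atomic LS of the form: measure(src, tgt) | threshold
    return f"{m['measure'].lower()}({m['src']}, {m['tgt']}) | {m['theta']}"
```

For example, `nl_to_ls("Link resources where Jaccard similarity of x.name and y.name is at least 0.8")` yields `"jaccard(x.name, y.name) | 0.8"`; inputs matched by no rule yield `None`, which is the main limitation of this baseline.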
We experimented with the following model families:
- T5: Encoder-Decoder architecture
- LLaMA-3: Decoder-only architecture
- LOLA: Decoder with Mixture-of-Experts (MoE) layers
We also evaluated GPT and Mistral models during early development. However, because they share LLaMA's decoder-only architecture, we excluded them from the final experiments to avoid redundancy.
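In the zero-shot setting, each NL input is wrapped in a fixed instruction prompt and the model's completion is taken as the LS. The template below is a hypothetical illustration; the exact prompt wording used in NL2LS may differ:

```python
# Hypothetical zero-shot prompt template (illustrative wording, not the
# project's actual prompt).
PROMPT_TEMPLATE = (
    "You are a link-discovery assistant. Translate the following natural-language "
    "description into a LIMES link specification (LS).\n\n"
    "Description: {nl}\n"
    "LS:"
)

def build_prompt(nl_sentence: str) -> str:
    """Wrap one NL sentence in the fixed instruction prompt."""
    return PROMPT_TEMPLATE.format(nl=nl_sentence)
```

The same template can be reused across all three model families; only the generation call differs between encoder-decoder and decoder-only models.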
We used and extended multiple benchmark datasets:
| Dataset | Description |
|---|---|
| LIMES-Silver | Automatically generated NL–LS pairs |
| LIMES-Annotated | Human-verified LS verbalizations |
| SILK-Annotated | Based on the SILK link discovery framework |
| LIMES-Geo-Temporal | Contains geospatial and temporal LSs |
| German counterparts | German translations of the NL inputs |
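To make the table concrete, here is a hypothetical record in the style of a silver NL–LS pair; the field names, LS syntax, and both sentences are illustrative, not actual dataset entries:

```python
# Illustrative (hypothetical) NL–LS training record with English and German NL.
example_pair = {
    "nl_en": "Link two cities if the trigram similarity of their labels is above 0.9.",
    "nl_de": "Verknüpfe zwei Städte, wenn die Trigramm-Ähnlichkeit ihrer Labels über 0,9 liegt.",
    "ls": "trigrams(x.rdfs:label, y.rdfs:label) | 0.9",
}
```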
We evaluated all models using:
- BLEU
- METEOR
- ChrF++
- TER
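As a reminder of what these scores measure, here is a simplified pure-Python sketch of sentence-level BLEU (modified n-gram precision combined with a brevity penalty). Reported results would normally come from a standard toolkit rather than a sketch like this:

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def sentence_bleu(hypothesis: str, reference: str, max_n: int = 4) -> float:
    """Simplified sentence-level BLEU on a 0-100 scale, without smoothing."""
    hyp, ref = hypothesis.split(), reference.split()
    precisions = []
    for n in range(1, max_n + 1):
        h, r = ngrams(hyp, n), ngrams(ref, n)
        # Modified precision: clip each n-gram count by its count in the reference.
        overlap = sum(min(c, r[g]) for g, c in h.items())
        precisions.append(overlap / max(sum(h.values()), 1))
    if min(precisions) == 0:
        return 0.0  # any empty n-gram overlap zeroes the geometric mean
    # Brevity penalty discourages hypotheses shorter than the reference.
    bp = 1.0 if len(hyp) > len(ref) else math.exp(1 - len(ref) / max(len(hyp), 1))
    return 100 * bp * math.exp(sum(math.log(p) for p in precisions) / max_n)
```

An exact match scores 100, so the BLEU scores near 98.8 reported below indicate that the generated LSs are almost token-identical to the references.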
Key results:
- Fine-tuned LOLA and LLaMA achieved BLEU scores of up to 98.8 on the English datasets.
- LOLA generalized well, scoring above 95 BLEU on the German test sets.
Requirements:
- Ubuntu 20.04.2 LTS
- Python ≥ 3.8
- torch ≥ 1.7.0
To run NL2LS locally or train/fine-tune models, download the NL2LS repository.
Recommended: create a virtual environment first:

```bash
python3 -m venv venv
source venv/bin/activate
```
Install the dependencies:

```bash
pip install -r requirements.txt
```
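After installation, a quick sanity check of the interpreter version (torch itself is only importable once the dependency install above has finished):

```python
import sys

# Verify the interpreter meets the stated minimum version for NL2LS.
assert sys.version_info >= (3, 8), "NL2LS requires Python >= 3.8"
print("Python version OK:", sys.version.split()[0])
```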