dice-group/NL2LS


NL2LS: LLM-based Automatic Linking of Knowledge Graphs

Can large language models (LLMs) recognize link specifications (LS)? We hypothesize that LS can be considered a low-resource language. In our previous work, we explored translating LS into natural language (NL); in this project, we tackle the reverse task of automatically generating LS from NL using LLMs, which we refer to as NL2LS.

NL2LS Architecture


Overview

NL2LS is a novel framework that automates the generation of link specifications (LS) from English and German natural language inputs using LLMs.

We address this task using:

  • Rule-based methods (regex-based)
  • Zero-shot learning via prompting
  • Supervised fine-tuning of LLMs
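As a rough illustration of the rule-based approach, a single regex rule can map a templated English sentence onto a LIMES-style atomic LS. The pattern, property names, and LS syntax below are illustrative assumptions, not the project's actual rules:

```python
import re

# One hypothetical rule: matches sentences of the form
# "... if the <measure> similarity of their <p> and <q> is at least <t>"
RULE = re.compile(
    r"if the (?P<measure>\w+) similarity of their "
    r"(?P<src>\w+) and (?P<tgt>\w+) is at least (?P<theta>[0-9.]+)"
)

def nl_to_ls(sentence: str) -> str:
    """Translate a templated NL sentence into an atomic link specification."""
    m = RULE.search(sentence.lower())
    if m is None:
        raise ValueError("no rule matched")
    # LIMES-style atomic LS: measure(x.property, y.property)|threshold
    return f"{m.group('measure')}(x.{m.group('src')}, y.{m.group('tgt')})|{m.group('theta')}"

print(nl_to_ls("Link two cities if the Jaccard similarity of their name and label is at least 0.8"))
# → jaccard(x.name, y.label)|0.8
```

A real rule-based system would need many such patterns, which is exactly the brittleness that motivates the zero-shot and fine-tuned LLM approaches.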

Models Used

We experimented with the following model families:

  • T5: Encoder-Decoder architecture
  • LLaMA-3: Decoder-only architecture
  • LOLA: Decoder with Mixture-of-Experts (MoE) layers

We also evaluated GPT and Mistral models during early development but did not include them in the final experiments: both are decoder-only architectures, so running them alongside LLaMA would have been largely redundant.


Datasets

We used and extended multiple benchmark datasets:

| Dataset | Description |
| --- | --- |
| LIMES-Silver | Automatically generated NL–LS pairs |
| LIMES-Annotated | Human-verified LS verbalizations |
| SILK-Annotated | Based on the SILK link discovery framework |
| LIMES-Geo-Temporal | Geospatial and temporal LSs |
| German Counterparts | German translations of the NL inputs |
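To make the data concrete, the following are illustrative NL–LS pairs in the style of these datasets; the field names and LS strings are assumptions for illustration, not actual benchmark entries:

```python
# Hypothetical NL–LS pairs, one English and one German, mimicking the
# benchmark format. The LS strings use LIMES-style atomic specifications.
examples = [
    {   # English input paired with a string-similarity LS
        "nl": "Link two lakes if the trigram similarity of their names is above 0.9.",
        "ls": "trigrams(x.name, y.name)|0.9",
    },
    {   # German counterpart: "Link two cities if the Jaccard similarity
        # of their names is at least 0.8."
        "nl": "Verlinke zwei Städte, wenn die Jaccard-Ähnlichkeit ihrer Namen mindestens 0.8 beträgt.",
        "ls": "jaccard(x.name, y.name)|0.8",
    },
]

for pair in examples:
    print(pair["nl"], "->", pair["ls"])
```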

Evaluation & Results

We evaluated all models using:

  • BLEU
  • METEOR
  • ChrF++
  • TER
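For intuition about the character-level metric, a simplified ChrF-style F-score can be sketched as follows. This is a stripped-down approximation (no word n-grams, no whitespace handling), not the standard implementation such as sacreBLEU's that evaluations normally rely on:

```python
from collections import Counter

def char_ngrams(text: str, n: int) -> Counter:
    """Multiset of character n-grams of a string."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))

def chrf_sketch(hypothesis: str, reference: str, max_n: int = 3, beta: float = 2.0) -> float:
    """Average character n-gram F-beta score (beta=2 weights recall, as in ChrF)."""
    scores = []
    for n in range(1, max_n + 1):
        hyp, ref = char_ngrams(hypothesis, n), char_ngrams(reference, n)
        if not hyp or not ref:
            continue
        overlap = sum((hyp & ref).values())  # clipped n-gram matches
        prec = overlap / sum(hyp.values())
        rec = overlap / sum(ref.values())
        if prec + rec == 0:
            scores.append(0.0)
            continue
        scores.append((1 + beta**2) * prec * rec / (beta**2 * prec + rec))
    return sum(scores) / len(scores) if scores else 0.0

print(chrf_sketch("jaccard(x.name, y.name)|0.8", "jaccard(x.name, y.name)|0.8"))
# → 1.0 for an exact match
```

Character-level metrics like this are well suited to LS output, where a single wrong token (a property name or threshold) should be penalized proportionally rather than zeroing out the whole score.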

Key Results:

  • LOLA and LLaMA (fine-tuned) achieved BLEU scores up to 98.8 on English datasets.
  • LOLA showed excellent generalization, with >95 BLEU on German test sets.

Environment and Dependencies

- Ubuntu 20.04.2 LTS
- Python ≥ 3.8
- torch ≥ 1.7.0

Installation

To run NL2LS locally or to train/fine-tune models, first clone the NL2LS repository.

Recommended: create a virtual environment:

```shell
python3 -m venv venv
source venv/bin/activate
```

Install dependencies:

```shell
pip install -r requirements.txt
```
