Can large language models (LLMs) recognize link specifications (LS)? We hypothesize that LS can be treated as a low-resource language. In this project, we tackle the problem of automating the generation of LS from natural language (NL) using LLMs, a task we refer to as NL2LS. The NL2LS framework handles both English and German NL inputs.
We address this task using:
- Rule-based methods (regex-based)
- Zero-shot learning via prompting
- Supervised fine-tuning of LLMs
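The rule-based baseline can be sketched as a set of regular expressions that map recognizable NL phrasings directly to LS strings. The pattern and the LS template below are illustrative assumptions for exposition, not the project's actual rules:

```python
import re
from typing import Optional

# Hypothetical regex rule: matches sentences of the form
# "Link ... where <measure> similarity of <src> and <tgt> is at least <theta>".
NL_PATTERN = re.compile(
    r"link .* where (?P<measure>\w+) similarity of (?P<src>[\w.]+) "
    r"and (?P<tgt>[\w.]+) is at least (?P<theta>[0-9.]+)",
    re.IGNORECASE,
)

def nl_to_ls(sentence: str) -> Optional[str]:
    """Translate one NL sentence into an LS string, or return None if no rule fires."""
    m = NL_PATTERN.search(sentence)
    if m is None:
        return None
    # Emit an atomic LS of the form: measure(src, tgt) | threshold
    return f"{m['measure'].lower()}({m['src']}, {m['tgt']}) | {m['theta']}"
```

For example, `nl_to_ls("Link resources where Jaccard similarity of x.name and y.name is at least 0.8")` yields `"jaccard(x.name, y.name) | 0.8"`; inputs matched by no rule yield `None`, which is the main limitation of this baseline.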
We experimented with the following model families:
- T5: Encoder-Decoder architecture
- LLaMA-3: Decoder-only architecture
- LOLA: Decoder with Mixture-of-Experts (MoE) layers
We also evaluated GPT and Mistral models during early development. However, because they share LLaMA's decoder-only architecture, we excluded them from the final experiments to avoid redundancy.
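In the zero-shot setting, each NL input is wrapped in a fixed instruction prompt and the model's completion is taken as the LS. The template below is a hypothetical illustration; the exact prompt wording used in NL2LS may differ:

```python
# Hypothetical zero-shot prompt template (illustrative wording, not the
# project's actual prompt).
PROMPT_TEMPLATE = (
    "You are a link-discovery assistant. Translate the following natural-language "
    "description into a LIMES link specification (LS).\n\n"
    "Description: {nl}\n"
    "LS:"
)

def build_prompt(nl_sentence: str) -> str:
    """Wrap one NL sentence in the fixed instruction prompt."""
    return PROMPT_TEMPLATE.format(nl=nl_sentence)
```

The same template can be reused across all three model families; only the generation call differs between encoder-decoder and decoder-only models.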
We used and extended multiple benchmark datasets:
| Dataset | Description |
|---|---|
| LIMES-Silver | Automatically generated NL–LS pairs |
| LIMES-Annotated | Human-verified LS verbalizations |
| SILK-Annotated | Based on the SILK link discovery framework |
| LIMES-Geo-Temporal | Contains geospatial and temporal LSs |
| German counterparts | German translations of the NL inputs |
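To make the table concrete, here is a hypothetical record in the style of a silver NL–LS pair; the field names, LS syntax, and both sentences are illustrative, not actual dataset entries:

```python
# Illustrative (hypothetical) NL–LS training record with English and German NL.
example_pair = {
    "nl_en": "Link two cities if the trigram similarity of their labels is above 0.9.",
    "nl_de": "Verknüpfe zwei Städte, wenn die Trigramm-Ähnlichkeit ihrer Labels über 0,9 liegt.",
    "ls": "trigrams(x.rdfs:label, y.rdfs:label) | 0.9",
}
```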
We evaluated all models using:
- BLEU
- METEOR
- ChrF++
- TER
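As a reminder of what these scores measure, here is a simplified pure-Python sketch of sentence-level BLEU (modified n-gram precision combined with a brevity penalty). Reported results would normally come from a standard toolkit rather than a sketch like this:

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def sentence_bleu(hypothesis: str, reference: str, max_n: int = 4) -> float:
    """Simplified sentence-level BLEU on a 0-100 scale, without smoothing."""
    hyp, ref = hypothesis.split(), reference.split()
    precisions = []
    for n in range(1, max_n + 1):
        h, r = ngrams(hyp, n), ngrams(ref, n)
        # Modified precision: clip each n-gram count by its count in the reference.
        overlap = sum(min(c, r[g]) for g, c in h.items())
        precisions.append(overlap / max(sum(h.values()), 1))
    if min(precisions) == 0:
        return 0.0  # any empty n-gram overlap zeroes the geometric mean
    # Brevity penalty discourages hypotheses shorter than the reference.
    bp = 1.0 if len(hyp) > len(ref) else math.exp(1 - len(ref) / max(len(hyp), 1))
    return 100 * bp * math.exp(sum(math.log(p) for p in precisions) / max_n)
```

An exact match scores 100, so the BLEU scores near 98.8 reported below indicate that the generated LSs are almost token-identical to the references.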
Key results:
- Fine-tuned LOLA and LLaMA achieved BLEU scores of up to 98.8 on the English datasets.
- LOLA generalized well, scoring above 95 BLEU on the German test sets.
Requirements:
- Ubuntu 20.04.2 LTS
- Python ≥ 3.8
- torch ≥ 1.7.0
To run NL2LS locally or train/fine-tune models, download the NL2LS repository.
Recommended: create a virtual environment first:

```bash
python3 -m venv venv
source venv/bin/activate
```
Install the dependencies:

```bash
pip install -r requirements.txt
```
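After installation, a quick sanity check of the interpreter version (torch itself is only importable once the dependency install above has finished):

```python
import sys

# Verify the interpreter meets the stated minimum version for NL2LS.
assert sys.version_info >= (3, 8), "NL2LS requires Python >= 3.8"
print("Python version OK:", sys.version.split()[0])
```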