Important:
- All Python scripts (`scripts/`, `translation_models/`, and any file ending in `.py`) are set up for prompt-contrastive decoding with the user message only.
- For system prompt experiments, please use the appropriate notebook in the `notebooks/` directory.
This repository builds on and extends the codebase of Sennrich et al. (EACL 2024) to support Llama 3.1 models and prompt-contrastive decoding, as part of Xiaojing Zhang's master's thesis at the University of Zurich. The goal is to reduce omission errors in low-resource machine translation with large language models (LLMs) through prompt-based decoding techniques.
- Support for Llama 3.0 and 3.1, including chat-based prompting
- Implementation of prompt-contrastive decoding
- Adapted scripts and new notebooks for Colab compatibility
All core logic and the original implementation are credited to Sennrich et al. (EACL 2024). This fork was extended and is maintained by Xiaojing Zhang as part of a master's thesis at the University of Zurich.
This project is based on the source-contrastive and language-contrastive decoding framework as described in Sennrich et al. (EACL 2024):
- Source-contrastive decoding: Search for a translation that maximizes P(Y|X) - λ·P(Y|X'), where X' is a random source segment. This penalizes hallucinations.
- Language-contrastive decoding: Search for a translation that maximizes P(Y|X,l_y) - λ·P(Y|X,l_y'), where l_y is the language indicator for the desired target language, and l_y' the indicator for an undesired language (such as English or the source language). This penalizes off-target translations.
- Prompt-contrastive decoding (this work): Search for a translation that maximizes P(Y|X,p_pos) - λ·P(Y|X,p_neg), where p_pos is the positive prompt encouraging the desired translation behavior, and p_neg the negative prompt inducing undesired translation behavior such as omissions. This penalizes omissions.
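All three objectives share the same mechanics: at each decoding step, the next-token log-probabilities under the contrastive input are subtracted, scaled by λ, from those under the actual input. Below is a minimal sketch of this per-step combination for a Hugging Face causal LM (a hypothetical helper for illustration, not the repository's beam-search code):

```python
import torch
import torch.nn.functional as F

def contrastive_next_token_scores(model, input_ids, contrastive_ids, lam=0.1):
    """One decoding step of contrastive scoring.

    input_ids:        prefix encoding the real input (X, or X plus p_pos)
                      followed by the partial hypothesis Y.
    contrastive_ids:  the same partial hypothesis conditioned on the
                      contrastive input (a random source X', or X plus p_neg).
    Returns scores proportional to log P(y|X,...) - lam * log P(y|X',...).
    """
    with torch.no_grad():
        pos_logits = model(input_ids).logits[:, -1, :]        # next-token logits, real input
        neg_logits = model(contrastive_ids).logits[:, -1, :]  # next-token logits, contrastive input
    return F.log_softmax(pos_logits, dim=-1) - lam * F.log_softmax(neg_logits, dim=-1)
```

Note that the CLI passes these weights as negative numbers (e.g. `--source_weight -0.7` in the example commands below), so λ above corresponds to the negated weight.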
Main modifications in this repository:
- `llama.py`:
  - Added a pad token, set the padding side, and defined EOS token IDs for Llama 3.1 models
  - Used role-based chat messages and the tokenizer's `apply_chat_template` method for Llama 3.1 models (see the sketch after this list)
  - Removed the `PromptTemplate` class, as the chat template now handles prompt formatting
  - Replaced the pipeline usage (preprocess, forward, and postprocess), since the chat template for Llama 3.1 is a list of dictionaries, which the pipeline does not accept
  - Changed the padding side and token, and stacked padded tensors into a single batch tensor as input to the model
  - Added a new parameter `is_prompt_contrastive` to handle contrastive prompts
- `__init__.py`: Added Llama 3 and 3.1 models
- `prompts.py`: New script to handle positive and negative prompts
- `mt_task.py`: Updated the `evaluate` method to handle contrastive prompt pairs
- `run.py`: Added two arguments to handle contrastive prompt pairs
- `utils_run.py` and `utils_llama.py`: Updated language codes for the FLORES+ dataset
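For illustration, a minimal sketch of the tokenizer changes and chat-template call listed above, using the Hugging Face `transformers` API (the model ID and prompt wording are assumptions, not the repository's exact code):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.1-8B-Instruct")

# Llama 3.1 ships without a pad token; alias it to EOS and pad on the left,
# as is usual for batched generation with decoder-only models.
tokenizer.pad_token = tokenizer.eos_token
tokenizer.padding_side = "left"

# Role-based messages replace the removed PromptTemplate class.
messages = [
    {"role": "user",
     "content": "Translate the following text from Mongolian to English: ..."},
]
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,  # append the assistant header so generation starts there
    return_tensors="pt",
)
```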
- `annotations/`: Manual annotation files (error analysis, omissions)
- `notebooks/`: Notebooks for demos and reproducing thesis results
- `outputs/`: Translation outputs and evaluation results generated in this thesis
- `predictions/`: Original outputs from Sennrich et al. (EACL 2024), for comparison/reference
- `scripts/`: Main experiment scripts and helper utilities
- `tests/`: Unit tests for core modules from Sennrich et al. (EACL 2024)
- `translation_models/`: Model wrappers and utilities (Llama, m2m100, small100)
- `illustration.png`, `logo.png`: Visual assets for documentation/thesis
- `LICENSE`, `README.md`, `requirements.txt`: Repository metadata and setup
- `python3 -m venv venv` (Linux/Mac) or `python -m venv venv` (Windows)
- `source venv/bin/activate` (Linux/Mac) or `venv\Scripts\activate` (Windows)
- `pip install -r requirements.txt`
- For prompt-contrastive decoding with the user message: use the Python scripts as described below.
- For prompt-contrastive decoding with a system prompt: please run the relevant notebook in `notebooks/`.
Example commands
Source-contrastive decoding with M2M-100 (418M) on Asturian–Croatian, with λ_src=0.7:
python -m scripts.run --model_path m2m100_418M --language_pairs ast-hr --source_contrastive --source_weight -0.7
Source-contrastive and language-contrastive decoding with SMaLL-100 on Pashto–Asturian, with 2 random source segments, λ_src=0.7, λ_lang=0.1, and English and Pashto as contrastive target languages:
python -m scripts.run --model_path small100 --language_pairs ps-ast --source_contrastive 2 --source_weight -0.7 --language_contrastive en ps --language_weight -0.1
Prompt-contrastive decoding with Llama 3.1 8B Instruct on Mongolian–English, with λ_prompt=0.1 and one contrastive prompt pair appended to the user message:
python -m scripts.run --model_path llama-3.1-8b-instruct --language_pairs mn-en --prompt_contrastive --prompt_weight -0.1
Source-contrastive and prompt-contrastive decoding with Llama 3.1 8B Instruct on Igbo–English, with 1 random source segment, λ_src=0.7, λ_prompt=0.1:
python -m scripts.run --model_path llama-3.1-8b-instruct --language_pairs ig-en --source_contrastive --source_weight -0.7 --prompt_contrastive --prompt_weight -0.1
Or run the provided notebook for a full Colab demo.
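For orientation, here is a hedged sketch of how the two new `run.py` arguments and a `prompts.py` prompt pair could fit together. Only the two flag names come from the commands above; the prompt wording and variable names are illustrative assumptions:

```python
import argparse

# prompts.py (sketch): a positive/negative prompt pair. The negative prompt
# deliberately invites omissions, so subtracting its scores penalizes them.
PROMPT_PAIR = {
    "positive": "Translate the complete source text without omitting anything.",
    "negative": "Translate the source text; you may leave parts out.",
}

parser = argparse.ArgumentParser()
parser.add_argument("--prompt_contrastive", action="store_true",
                    help="enable prompt-contrastive decoding with one prompt pair")
parser.add_argument("--prompt_weight", type=float, default=-0.1,
                    help="weight applied to the negative-prompt scores (negative value)")
args = parser.parse_args()

if args.prompt_contrastive:
    # p_pos goes into the user message of the real input; p_neg builds the
    # contrastive input whose log-probabilities are scaled by prompt_weight.
    p_pos, p_neg = PROMPT_PAIR["positive"], PROMPT_PAIR["negative"]
```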
- FLORES-101, as in the original repo
- FLORES+ (the `devtest` split is used for evaluation)
Multiple models are implemented:
- M2M-100 (418M): use `--model_path m2m100_418M`
- SMaLL-100: use `--model_path small100`
- Llama 3.1 8B Instruct: use `--model_path llama-3.1-8b-instruct`
- chrF2: `sacrebleu ref.txt < output.txt --metrics chrf`
- spBLEU: `sacrebleu ref.txt < output.txt --tokenize flores101`
- MetricX-23-XL: run the provided notebook.
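The same two scores can also be computed from Python via sacrebleu's metric classes (a sketch; file names as in the commands above):

```python
from sacrebleu.metrics import BLEU, CHRF

with open("output.txt") as f:
    hyps = [line.rstrip("\n") for line in f]
with open("ref.txt") as f:
    refs = [line.rstrip("\n") for line in f]

chrf = CHRF()                        # beta defaults to 2, i.e. chrF2
spbleu = BLEU(tokenize="flores101")  # spBLEU: BLEU over the FLORES-101 sentencepiece tokenization
print(chrf.corpus_score(hyps, [refs]))
print(spbleu.corpus_score(hyps, [refs]))
```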
@inproceedings{sennrich-etal-2024-mitigating,
title={Mitigating Hallucinations and Off-target Machine Translation with Source-Contrastive and Language-Contrastive Decoding},
author={Rico Sennrich and Jannis Vamvas and Alireza Mohammadshahi},
booktitle={18th Conference of the European Chapter of the Association for Computational Linguistics},
year={2024}
}