This repository contains translation models for two Mayan languages, Yucatec Maya and Chol. It is part of a project aimed at advancing neural machine translation (NMT) for indigenous languages by leveraging Spanish-trained large language models (LLMs), an approach motivated by the scarcity of comprehensive parallel corpora for these languages.
Advancing NMT for Indigenous Languages: This project presents a case study on Yucatec Maya and Chol, two Mayan languages of significant historical and cultural importance that remain digitally under-resourced. By implementing a prompt-based approach to train both one-to-many and many-to-many models, the study demonstrates that LLMs are a viable route to improving accessibility and preservation for low-resource languages; a minimal sketch of the prompt format follows the goals below. The best-performing models reached a ChrF++ score of 50 on the test set, indicating strong potential for these approaches in real-world applications.
- Develop and refine NMT systems for Mayan languages.
- Bridge the digital gap for Mayan language communities, enhancing access to vital services.
- Promote collaboration and further research in NLP for indigenous languages.
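To make the prompt-based setup above concrete, the sketch below formats parallel sentence pairs with a target-language prefix, so that a single model can be steered toward either Mayan language (one-to-many) or, with additional directions, between any pair of languages (many-to-many). The prompt template, the `build_example` helper, and the placeholder target strings are illustrative assumptions, not the exact format used by the training scripts.

```python
# A minimal sketch of prompt-based data formatting for one-to-many /
# many-to-many translation. The template and placeholder targets are
# assumptions for illustration; see the training scripts for the
# project's actual format.

def build_example(src_text: str, tgt_text: str, src_lang: str, tgt_lang: str) -> dict:
    """Wrap a parallel sentence pair in a translation prompt."""
    prompt = f"translate {src_lang} to {tgt_lang}: {src_text}"
    return {"input": prompt, "target": tgt_text}

# One-to-many: Spanish into each Mayan language (targets are placeholders).
pairs = [
    ("la casa es grande", "<Yucatec Maya translation>", "Spanish", "Yucatec Maya"),
    ("la casa es grande", "<Chol translation>", "Spanish", "Chol"),
]

# A many-to-many setup would additionally include the reverse directions
# (Mayan -> Spanish) and, optionally, Mayan <-> Mayan pairs.
examples = [build_example(s, t, sl, tl) for s, t, sl, tl in pairs]
for ex in examples:
    print(ex["input"], "->", ex["target"])
```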
```bash
# Fine-tune M2M100 (418M) and the Spanish T5-base model on the Mayan data
./train/m2m100_418M/mayan.sh
./train/t5-base-spanish/mayan.sh

# Evaluate the fine-tuned models on the test set
./test/m2m100_418M/mayan.sh
./test/t5-base-spanish/mayan.sh
```
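Once a model is trained, or downloaded from the repositories linked below, it can presumably be run with the Hugging Face `transformers` library. In the sketch below, the repository id `your-org/t5-base-spanish-mayan` is a placeholder for one of the actual model ids, and the prompt format is an assumption for a T5-style fine-tuned model.

```python
# A minimal inference sketch using Hugging Face transformers.
# The model id is a placeholder; replace it with one of the
# repositories linked below. The prompt format is an assumption.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "your-org/t5-base-spanish-mayan"  # placeholder id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

prompt = "translate Spanish to Yucatec Maya: la casa es grande"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```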
Visit the following Hugging Face model repositories to access the pre-trained models: