Indicate: Transliterate Indic Languages to English

Transliterations to and from Indian languages are still generally low quality. One problem is access to data. Another is that there is no standard transliteration. For Hindi-English, we build a novel dataset of names using ESPNcricinfo, which has Hindi versions of its English scorecards. We also create a dataset from election affidavits and use the Google Dakshina dataset.

Because there is no single standard way to transliterate, we provide k-best transliterations.

Install

We strongly recommend installing indicate inside a Python virtual environment (see the venv documentation).

pip install indicate

General API

transliterate.hindi2english takes Hindi text and transliterates it into English.

Examples

from indicate import transliterate
english_translated = transliterate.hindi2english("हिंदी")
print(english_translated)

output - hindi

Functions

We expose one function, which takes Hindi text and transliterates it to English.

transliterate.hindi2english(input)
- What it does:
  - Converts the given Hindi text into the English alphabet
- Output:
  - Returns the text in English
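
When many names need transliterating, the single function above can be wrapped in a small helper. The following is a minimal sketch, not part of the package: transliterate_all and its transliterate_fn parameter are hypothetical names introduced here for illustration. It assumes only that the function passed in maps one string to one string, as transliterate.hindi2english is described to do, and it reuses results for repeated inputs so the model is not called twice on the same text.

```python
from typing import Callable, Dict, Iterable, List


def transliterate_all(
    texts: Iterable[str],
    transliterate_fn: Callable[[str], str],
) -> List[str]:
    """Apply transliterate_fn to each text, reusing results for repeated inputs.

    transliterate_all is a hypothetical helper, not an indicate API.
    """
    cache: Dict[str, str] = {}
    results: List[str] = []
    for text in texts:
        if text not in cache:
            # Call the (possibly slow) transliteration function once per distinct text.
            cache[text] = transliterate_fn(text)
        results.append(cache[text])
    return results


# Stand-in for transliterate.hindi2english so the sketch runs without the model.
stub_table = {"हिंदी": "hindi"}
print(transliterate_all(["हिंदी", "हिंदी"], lambda s: stub_table.get(s, s)))
```

With the package installed, transliterate.hindi2english could be passed as transliterate_fn in place of the stand-in.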

Data

The datasets used to train the model:

Indian Election affidavits
Google Dakshina dataset
ESPNcricinfo (Hindi versions of the English scorecards)
IIT Bombay English-Hindi Corpus

Evaluation

The model was evaluated on the test split of the Google Dakshina dataset, where it predicted 73.64% exact matches. Indic-trans predicted 63.12% exact matches on the same dataset. We also measured edit distance on the test dataset: 0.0 means an exact match, and the farther the value is from 0.0, the greater the difference between the predicted text and the actual text.
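
The two metrics mentioned above can be illustrated with a short sketch. This is not the authors' evaluation script: edit_distance and exact_match_rate are hypothetical helper names, the edit distance shown is plain character-level Levenshtein distance, and whether the reported numbers were normalized (for example, by word length) is not stated here. The sample words are made up for illustration.

```python
from typing import Sequence


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum number of insertions, deletions, substitutions."""
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, char_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute, or keep on a match
            ))
        previous = current
    return previous[-1]


def exact_match_rate(predicted: Sequence[str], actual: Sequence[str]) -> float:
    """Percentage of predictions that equal the reference exactly."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        return 0.0
    matches = sum(p == a for p, a in zip(predicted, actual))
    return 100.0 * matches / len(actual)


# Illustrative data only, not taken from any dataset.
predicted = ["hindi", "dilli", "mumbai"]
actual = ["hindi", "delhi", "mumbai"]
print(exact_match_rate(predicted, actual))
print([edit_distance(p, a) for p, a in zip(predicted, actual)])
```

An edit distance of 0 corresponds to an exact match, so the exact match rate is the share of test words whose distance is 0.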

Authors

Rajashekar Chintalapati and Gaurav Sood

Contributor Code of Conduct

The project welcomes contributions from everyone! In fact, it depends on it. To maintain this welcoming atmosphere, and to collaborate in a fun and productive way, we expect contributors to the project to abide by the Contributor Code of Conduct.

License

The package is released under the MIT License.
