Adding a new language
This page describes the steps for adding a new language to Tatoeba from a development point of view.
- Open the file `src/Lib/LanguagesLib.php`.
- Add the language in the `$languages` array in the function `languagesInTatoeba()`.
- If the language has an ISO 639-1 code, add it in the `$map` array in the function `get_Iso639_3_To_Iso639_1_Map()`.
- If the language only has one writing system and is written right to left, add the language code in the `$rightToLeftLangs` array in the function `getLanguageDirection($lang)`.
- If the language has more than one writing system and can be written either right to left or left to right, add the language code in the `$autoLangs` array in the function `getLanguageDirection($lang)`.
- Language icons are located in `webroot/img/flags`.
- The icon needs to be an SVG file, named with the ISO 639-3 language code (e.g. `ita.svg`).
If an SVG file is provided in the language request:
- Download the file, rename it accordingly and put it in the `webroot/img/flags` folder.
- Open the file with a text editor:
  - Remove comments if there are any.
  - Remove the "id" attribute from the `<svg>` tag if there is one.
    - Not doing so will trigger an error with the asset build (see for example issue #2911).
  - Optionally, if you are familiar enough with SVG, feel free to optimize the code to make the file size as small as possible.
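The two manual clean-up steps above (removing comments and removing the "id" attribute from the `<svg>` tag) could also be scripted. The following is a minimal sketch in Python, not part of the Tatoeba code base: the function name `clean_svg` is hypothetical, and the regular expressions assume a simple, well-formed SVG file in which attribute values do not contain `>`.

```python
import re


def clean_svg(svg_text: str) -> str:
    """Strip XML comments and the id attribute of the first <svg> tag."""
    # Remove XML comments (<!-- ... -->), including multi-line ones.
    without_comments = re.sub(r"<!--.*?-->", "", svg_text, flags=re.DOTALL)

    def strip_id(match):
        # Drop only an id="..." (or id='...') attribute inside the matched tag.
        # The leading \s+ keeps attributes such as xml:id or data-id untouched.
        return re.sub(r"""\s+id\s*=\s*("[^"]*"|'[^']*')""", "", match.group(0))

    # count=1 restricts the change to the first (root) <svg> opening tag,
    # so id attributes on child elements are left as they are.
    return re.sub(r"<svg\b[^>]*>", strip_id, without_comments, count=1)
```

To use it, read the icon file as text, pass the contents to `clean_svg`, and write the result back. It is still worth opening the file afterwards to check the result by eye.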
If only a PNG file is provided in the language request, you will need to create an SVG from this PNG, as follows:
- Compress the PNG icon (https://compresspng.com/).
- Convert it into a data URI (https://ezgif.com/image-to-datauri).
- Download the SVG template.
- Open the template in a text editor (such as Notepad++).
- In the template, replace `{dataURI}` with the string that you got from converting the PNG to a data URI.
- Save the changes, name the file accordingly and put it in the `webroot/img/flags` folder.
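If you prefer not to use the online converter, the conversion and substitution steps can be done locally. The following Python sketch is only an illustration: the function names are hypothetical, and it assumes the template is read as a string that contains the literal placeholder `{dataURI}`. It does not compress the PNG, so that step still needs to be done beforehand.

```python
import base64


def png_to_data_uri(png_bytes: bytes) -> str:
    """Encode raw PNG bytes as a base64 data URI."""
    encoded = base64.b64encode(png_bytes).decode("ascii")
    return f"data:image/png;base64,{encoded}"


def fill_template(template: str, data_uri: str) -> str:
    """Replace the {dataURI} placeholder in the SVG template text."""
    if "{dataURI}" not in template:
        # Fail loudly rather than silently writing an icon without an image.
        raise ValueError("template has no {dataURI} placeholder")
    return template.replace("{dataURI}", data_uri)
```

A typical use would be to read the compressed PNG in binary mode, read the downloaded template as text, and write the output of `fill_template` to a file named after the ISO 639-3 code in `webroot/img/flags`.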
In some cases, the new language will use a script that is not yet handled by the search engine. The new characters need to be defined in the SphinxConfShell.php file, otherwise the sentences in the new language cannot be indexed and searched.
- Open the `/src/Shell/SphinxConfShell.php` file.
- Add the Unicode block
  - in the `$charsetTable` table if the language has spaces to separate words (like in English)
  - in the `$scriptsWithoutWordBoundaries` table if the language has no spaces (like in Chinese)
Example with Dhivehi: https://github.com/Tatoeba/tatoeba2/pull/1855#issuecomment-480477645
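To find out which Unicode block a new language needs, it can help to list the code points that actually occur in a sample sentence. The sketch below is a hypothetical Python helper (not part of Tatoeba) that prints each distinct non-ASCII character as `U+XXXX` together with its Unicode name. The exact syntax expected by `$charsetTable` and `$scriptsWithoutWordBoundaries` is not shown on this page, so compare with the existing entries in `SphinxConfShell.php` before adding anything.

```python
import unicodedata


def describe_non_ascii(sample: str) -> list:
    """Return (code point, Unicode name) pairs for distinct non-ASCII characters."""
    # A set removes duplicates; sorting orders the characters by code point.
    distinct = sorted({ch for ch in sample if ord(ch) > 0x7F})
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"))
        for ch in distinct
    ]


def print_code_points(sample: str) -> None:
    """Print one line per distinct non-ASCII character in the sample."""
    for code_point, name in describe_non_ascii(sample):
        print(code_point, name)
```

If every listed name starts with the same script name, the sample most likely falls within a single Unicode block, whose full range can then be looked up in the Unicode charts.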
This step is only useful if you have Tatoeba installed locally and want to test the changes that you made in the previous steps.
For this, you can follow the instructions found in the Adding a new language section of the Deployment page. That page describes how to add the language on production, but the steps are the same in a local environment if you have installed TatoVM.
If you have any questions, if something on this page was not clear enough for you, or if you have suggestions to improve it, please let us know: [email protected].