HOWTO Add a New Locale
Mickaël Schoentgen edited this page Nov 8, 2023
- Copy an existing lang file from the lang folder. Remove all data from the old lang.
- Copy an existing test file from the tests folder. Remove all data from the old lang.
- Update lang/__init__.py accordingly.
- Add the locale code in scripts/all-namespaces.py and run:
  python -m scripts
- Test it:
  python -m pytest tests/tests_$LOCALE.py
- When you think you are ready, fetch and convert all words:
  # Run the command that will fetch the data and convert it into dicthtml-$LOCALE.zip
  python -m wikidict $LOCALE
That's it! Thanks a lot for your contribution ❤️
When done, a maintainer will:
- Create a new release with the tag $LOCALE. This is where the dictionary will be uploaded.
- Update the README to include the new locale in the Dictionaries section. Please keep it alphabetically sorted, and write the language name in its own language, not in English.
You first need to find the right head_sections and section_level.
Then:
python -m wikidict $LOCALE --find-templates
The file sections.txt is created.
When sections are set, you can now find templates:
python -m wikidict $LOCALE --find-templates
The file templates.txt is created.
Have a look at sections.txt first, and then at templates.txt.
When you find a new section or template, add a test (you can have a look at existing tests).
You can also get the definition quickly for a word and see the formatting:
python -m wikidict $LOCALE --get-word "word" [--raw]
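To look at several words in one go, the command above can be wrapped in a few lines of Python. This is only a sketch: build_get_word_command and show_words are hypothetical helper names that are not part of wikidict, and it assumes the module is runnable as python -m wikidict from the current directory.

```python
import subprocess
import sys


def build_get_word_command(locale: str, word: str, raw: bool = False) -> list[str]:
    """Build the argv for: python -m wikidict LOCALE --get-word WORD [--raw]."""
    command = [sys.executable, "-m", "wikidict", locale, "--get-word", word]
    if raw:
        command.append("--raw")
    return command


def show_words(locale: str, words: list[str], raw: bool = False) -> None:
    """Run the command once per word; output goes straight to the terminal."""
    for word in words:
        # check=False: keep going even if one word fails.
        subprocess.run(build_get_word_command(locale, word, raw), check=False)
```

For example, show_words("fr", ["chat", "chien"], raw=True) would run the --raw variant for two placeholder words in a placeholder locale.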
Run that script:
# File: parenthesis.py
import json
import re
import sys

with open(sys.argv[1]) as fh:
    words = json.load(fh)

seen = set()
pattern = re.compile(r"(\([A-Z]+[^\)]+\))")

for word, definitions in words.items():
    for definition in definitions[-1]:
        if isinstance(definition, str):
            for m in pattern.findall(definition):
                if m not in seen:
                    print(m, repr(word))
                    seen.add(m)
        else:
            for subdef in definition:
                for m in pattern.findall(subdef):
                    if m not in seen:
                        print(m, repr(word))
                        seen.add(m)
Use it like:
python parenthesis.py data/$LOCALE/data.json
It will output all words in parentheses (most of them are templates). Just check that nothing looks weird; if something does, it means that you have another template to handle :)
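To see what the pattern in parenthesis.py actually catches, here is a small self-contained sketch that applies the same regular expression to made-up data. The sample words and definitions are invented for illustration, and find_parenthesized is a hypothetical helper name. The data shape (the last item of each entry is a list of definitions, each one either a string or a sequence of sub-definitions) is assumed from how the script walks data.json.

```python
import re

# Same pattern as in parenthesis.py: an opening parenthesis, one or more
# uppercase ASCII letters, at least one more character, a closing parenthesis.
PATTERN = re.compile(r"(\([A-Z]+[^\)]+\))")


def find_parenthesized(words: dict) -> list[tuple[str, str]]:
    """Return (match, word) pairs, keeping only the first word per match."""
    seen = set()
    found = []
    for word, details in words.items():
        for definition in details[-1]:
            # Treat a plain string as a one-item group of sub-definitions.
            subdefs = [definition] if isinstance(definition, str) else definition
            for subdef in subdefs:
                for match in PATTERN.findall(subdef):
                    if match not in seen:
                        seen.add(match)
                        found.append((match, word))
    return found


# Invented sample data, shaped like what the script expects.
sample = {
    "oak": ["", ["(Botany) A tree.", ["(Botany) Its wood.", "(Figuratively) Strength."]]],
    "run": ["", ["(informal) To flee.", "To move fast (Sport)."]],
}

for match, word in find_parenthesized(sample):
    print(match, repr(word))
# Prints:
# (Botany) 'oak'
# (Figuratively) 'oak'
# (Sport) 'run'
```

Note that the lowercase "(informal)" is not reported: the pattern only matches parentheses whose content starts with an uppercase letter, so leftovers that start in lowercase would need a different check.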