- Go to sjp.pl and download the `sjp-ispell-pl-[date]-src.tar.bz2` file.
- Unpack the file and `cd` to the directory.
- Download the Polish stopwords file from the GitHub repo and save it in the current directory.
- Convert encodings to UTF-8 and give proper extensions to the files:

      $ iconv -f ISO_8859-2 -t utf-8 polish.aff > polish.affix
      $ iconv -f ISO_8859-2 -t utf-8 polish.all > polish.dict
      $ mv polish.stopwords.txt polish.stop

- Copy the files to the `tsearch_data` subdirectory of the directory printed by `pg_config --sharedir`:

      $ sudo cp polish.affix `pg_config --sharedir`/tsearch_data/
      $ sudo cp polish.dict `pg_config --sharedir`/tsearch_data/
      $ sudo cp polish.stop `pg_config --sharedir`/tsearch_data/

- Now, in postgres:

      CREATE TEXT SEARCH DICTIONARY pl_ispell (
        Template = ispell,
        DictFile = polish,
        AffFile = polish,
        StopWords = polish
      );

      CREATE TEXT SEARCH CONFIGURATION pl_ispell (parser = default);

      ALTER TEXT SEARCH CONFIGURATION pl_ispell
        ALTER MAPPING FOR asciiword, asciihword, hword_asciipart, word, hword, hword_part
        WITH pl_ispell;

- Test, in postgres:

      SELECT to_tsvector('pl_ispell', 'Czuję się mniej więcej tak, jak ktoś, kto bujał w obłokach i nagle spadł.');

  Output:

                                          to_tsvector
      ------------------------------------------------------------------------------------
       'bujać':9 'czuć':1 'jaka':6 'mniej':3 'nagły':13 'obłok':11 'obłoki':11 'spaść':14
      (1 row)
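
If creating the dictionary or running the test query fails, the usual suspects are a file missing from `tsearch_data` or a file that did not end up as UTF-8. The sketch below checks both. It is only an illustration: `is_utf8` and `check_dict_files` are hypothetical helper names, not part of this repository, and the script assumes the three file names used in the steps above and that `iconv` exits with a non-zero status on undecodable input.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the file decodes cleanly as UTF-8.
is_utf8() {
    iconv -f utf-8 -t utf-8 "$1" > /dev/null 2>&1
}

# Hypothetical helper: print one line for every dictionary file in the given
# directory that is missing, unreadable, or not valid UTF-8.
check_dict_files() {
    dir=$1
    for f in polish.affix polish.dict polish.stop; do
        if [ ! -r "$dir/$f" ]; then
            echo "$f: missing or unreadable"
        elif ! is_utf8 "$dir/$f"; then
            echo "$f: not valid UTF-8"
        fi
    done
}

# Run the check against the PostgreSQL share directory when pg_config exists.
if command -v pg_config > /dev/null 2>&1; then
    target="$(pg_config --sharedir)/tsearch_data"
    problems=$(check_dict_files "$target")
    if [ -z "$problems" ]; then
        echo "dictionary files in $target look fine"
    else
        echo "$problems" >&2
    fi
fi
```

The same function can be pointed at the working directory before copying, for example `check_dict_files .`, to catch a failed conversion early.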
PostgreSQL Full-Text Search - Polish Dictionary
dominem/postgresql_fts_polish_dict