Releases: PyThaiNLP/pythainlp

PyThaiNLP v5.0.5 Released!

14 Dec 14:20
0c20956

PyThaiNLP v5.0.5 is a bug fix release of PyThaiNLP v5.0.

Install: pip install pythainlp
Upgrade: pip install -U pythainlp

See PyThaiNLP 5.0 Change Log: #788.

What's Changed

Full Changelog: v5.0.4...v5.0.5

PyThaiNLP v5.0.4 Released!

02 Jun 14:49

PyThaiNLP v5.0.4 is a bug fix release of PyThaiNLP v5.0.3.

Install: pip install pythainlp
Upgrade: pip install -U pythainlp

See PyThaiNLP 5.0 Change Log: #788.

What's Changed

Full Changelog: v5.0.3...v5.0.4

PyThaiNLP v5.0.3 Released!

12 May 10:29

PyThaiNLP v5.0.3 is a bug fix release of PyThaiNLP v5.0.2.

Install: pip install pythainlp
Upgrade: pip install -U pythainlp

See PyThaiNLP 5.0 Change Log: #788.

What's Changed

  • Create .editorconfig by @bact in #909
  • Fix empty string ('') added (in some cases) when using word_tokenize with join_broken_num=True by @S2P2 in #912
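
The fix in #912 can be checked from user code with a small regression test. The following is a minimal sketch, not taken from the PyThaiNLP test suite: `find_empty_tokens` is a hypothetical helper, the sample text is illustrative, and the import is guarded so the snippet still runs when PyThaiNLP is not installed.

```python
# Sketch of a check for empty-string tokens; assumes nothing about
# which inputs triggered the original bug.
try:
    from pythainlp.tokenize import word_tokenize
except ImportError:  # PyThaiNLP is not installed
    word_tokenize = None


def find_empty_tokens(tokens):
    """Return the indexes of empty-string tokens in a token list."""
    return [i for i, tok in enumerate(tokens) if tok == ""]


if word_tokenize is not None:
    # join_broken_num=True asks the tokenizer to rejoin numbers that were
    # split into several tokens.
    tokens = word_tokenize("ราคา 1,234.50 บาท", join_broken_num=True)
    print(tokens)
    print("empty tokens at:", find_empty_tokens(tokens))
```

On v5.0.3 or later, the list of empty-token indexes is expected to be empty for inputs affected by the bug.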

New Contributors

  • @S2P2 made their first contribution in #912

Full Changelog: v5.0.2...v5.0.3

PyThaiNLP v5.0.2 Released!

03 Apr 10:20

PyThaiNLP v5.0.2 is a bug fix release of PyThaiNLP v5.0.1.

Install: pip install pythainlp
Upgrade: pip install -U pythainlp

See PyThaiNLP 5.0 Change Log: #788.

What's Changed

New Contributors

Full Changelog: v5.0.1...v5.0.2

Contributors

Thanks to all the contributors. (Image made with contributors-img)

PyThaiNLP v5.0.1 Released!

10 Feb 15:10

PyThaiNLP v5.0.1 is a bug fix release of PyThaiNLP v5.0.0.

Install: pip install pythainlp
Upgrade: pip install -U pythainlp

See PyThaiNLP 5.0 Change Log: #788.

What's Changed

  • Fixed bug: ImportError pycrfsuite #901

Full Changelog: v5.0.0...v5.0.1

Contributors

Thanks to all the contributors. (Image made with contributors-img)

PyThaiNLP v5.0.0 Released!

10 Feb 05:31

We are excited to announce the latest release of PyThaiNLP: version 5.0! PyThaiNLP is a Python library for Thai natural language processing (NLP).

With PyThaiNLP 5.0, you can expect improved performance and accuracy for NLP tasks in Thai. We have also added new functions to make your NLP tasks even easier and more efficient.

Install: pip install pythainlp
Upgrade: pip install -U pythainlp

See PyThaiNLP 5.0 Change Log: #788.

What is new?

License information

Deprecation and other API changes

  • Change default NER to thainer-v2 5e97e7c
  • Move pythainlp.util.is_native_thai to pythainlp.morpheme.is_native_thai 524759a
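
Code that has to run on both 4.x and 5.0 can cope with a relocated function such as `is_native_thai` by trying the new module first and falling back to the old one. The following is a minimal sketch, not part of PyThaiNLP: `import_first` is a hypothetical helper name, and it is demonstrated here with standard-library modules so that it runs without PyThaiNLP installed.

```python
import importlib


def import_first(name, module_paths):
    """Return attribute `name` from the first importable module that has it."""
    for path in module_paths:
        try:
            module = importlib.import_module(path)
        except ImportError:
            continue  # module is missing in this environment, try the next one
        if hasattr(module, name):
            return getattr(module, name)
    raise ImportError(f"{name!r} not found in any of {module_paths!r}")


# Demonstration with the standard library: the first module does not exist,
# so the helper falls back to `math`.
sqrt = import_first("sqrt", ["no_such_module_xyz", "math"])
print(sqrt(9))
```

With PyThaiNLP installed, the same helper could be called as `import_first("is_native_thai", ["pythainlp.morpheme", "pythainlp.util"])`.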

Dependency

New API

Improve

  • Update code comments and clean up code by @BLKSerene in #845
  • Improve the documentation by fixing typos and adding necessary details, explanations of the code, and missing details about models and examples by @Saharshjain78 in #850
  • Fix tests of khavee functions by @BLKSerene in #854
  • Update Git Actions versions by @bact in #878
  • Fix ruff args in workflow by @bact in #880
  • Revise ruff args in workflow by @bact in #881
  • Fix coref return type and add fallback by @bact in #883
  • Fix wrong/incompatible types, code readability by @bact in #884
  • Bump protobuf from 3.20 to 3.20.2 in #885
  • Add license info to /tests and README_TH.md by @bact in #886
  • phayathaibert, khavee, parse: Code clean up by @bact in #889
  • ruff: docstring-code-format = true by @bact in #892

Tokenizer

  • Add wtpsplit engine to sentence_tokenize #804
  • New paragraph_tokenize function to split Thai text into paragraphs #804
  • Add paragraph_threshold to the paragraph_tokenize() function #806 by @pavaris-pm
  • Add 🪿 Han-solo by @wannaphong in #830
  • Fix newmm to better handle non-Thai characters in tokens #856 by @konbraphat51
  • Fix incorrect passing of flags to re.split by @hauntsaninja in #832
  • Add syllable_tokenize by @wannaphong in #834
  • Add wanchanberta_thai_grammarly by @wannaphong in #836
  • Add extra segmentation style for paragraph_tokenize function by @pavaris-pm in #844
  • Improve: [newmm tokenizer] Change regular expression of "non-thai-characters" by @konbraphat51 in #856
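
The `re.split` item above refers to a common pitfall in Python's `re` module: the third positional parameter of `re.split` is `maxsplit`, not `flags`. The following sketch is a generic illustration of that pitfall with made-up input; it does not reproduce the code changed in #832.

```python
import re

text = "1x2X3x4x5"

# Buggy form: re.split("x", text, re.IGNORECASE) passes the flag in the
# third positional slot, which is maxsplit. re.IGNORECASE has the integer
# value 2, so the call behaves like the explicit version below: at most
# two splits, and matching stays case-sensitive.
buggy = re.split("x", text, maxsplit=int(re.IGNORECASE))

# Fixed form: pass the flag by keyword so it reaches the regex engine.
fixed = re.split("x", text, flags=re.IGNORECASE)

print(buggy)  # ['1', '2X3', '4x5']
print(fixed)  # ['1', '2', '3', '4', '5']
```

Passing `flags` by keyword avoids the problem.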

Tag

Chat

Translate

Transliterate

Corpus

  • Add pythainlp.corpus.thai_orst_words() Thai word list from Royal Society of Thailand (ORST) #810 by @wannaphong
  • Add pythainlp.corpus.thai_wikipedia_titles() Thai word list (noun and noun phrases) from Thai Wikipedia titles #869 by @konbraphat51
  • Add pythainlp.corpus.thai_volubilis_words() Thai word list from Volubilis dictionary #870 by @konbraphat51
  • Add pythainlp.corpus.thai_icu_words() Thai word list from ICU BreakIterator dictionary #879 by @pavaris-pm
  • Rename Volubilis/Wikipedia corpus function names for consistency / Fix types by @bact in #882
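
The new word-list functions can be used for simple dictionary lookups. The following is a minimal sketch under stated assumptions: `load_orst_words` and `known_words` are hypothetical helper names, the candidate words are illustrative, and the import is guarded so the snippet runs even when PyThaiNLP (or a version that provides `thai_orst_words`) is not installed.

```python
def load_orst_words():
    """Return the ORST word list, or None when it is unavailable."""
    try:
        from pythainlp.corpus import thai_orst_words
    except ImportError:  # PyThaiNLP missing or older than 5.0
        return None
    return thai_orst_words()


def known_words(candidates, word_list):
    """Keep only the candidates that appear in the given word list."""
    return [word for word in candidates if word in word_list]


words = load_orst_words()
if words is not None:
    print(len(words), "words loaded")
    print(known_words(["แมว", "zzz"], words))
```

The same pattern should apply to the other word-list functions listed above, assuming they also return a collection that supports membership tests.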

Util

New Contributors

Full Changelog: v4.0.2...v5.0.0

Contributors

Thanks to all the contributors. (Image made with contributors-img)

PyThaiNLP v5.0.0-beta1

05 Feb 05:37
Pre-release

Schedule

  • First Beta release: 5 February 2024
  • Production release: 10 February 2024

See 5.0 Milestone.

What is new?

License information

  • Use SPDX license identifier at the header of source code #876

Deprecation and other API changes

  • Change default NER to thainer-v2 5e97e7c
  • Move pythainlp.util.is_native_thai to pythainlp.morpheme.is_native_thai 524759a

Dependency

New API

Improve

  • Update code comments and clean up code by @BLKSerene in #845
  • Improve the documentation by fixing typos and adding necessary details, explanations of the code, and missing details about models and examples by @Saharshjain78 in #850
  • Fix tests of khavee functions by @BLKSerene in #854
  • Update Git Actions versions by @bact in #878
  • Fix ruff args in workflow by @bact in #880
  • Revise ruff args in workflow by @bact in #881
  • Fix coref return type and add fallback by @bact in #883
  • Fix wrong/incompatible types, code readability by @bact in #884
  • Bump protobuf from 3.20 to 3.20.2 in #885
  • Add license info to /tests and README_TH.md by @bact in #886
  • phayathaibert, khavee, parse: Code clean up by @bact in #889
  • ruff: docstring-code-format = true by @bact in #892

Tokenizer

  • Add wtpsplit engine to sentence_tokenize #804
  • New paragraph_tokenize function to split Thai text into paragraphs #804
  • Add paragraph_threshold to the paragraph_tokenize() function #806 by @pavaris-pm
  • Add 🪿 Han-solo by @wannaphong in #830
  • Fix newmm to better handle non-Thai characters in tokens #856 by @konbraphat51
  • Fix incorrect passing of flags to re.split by @hauntsaninja in #832
  • Add syllable_tokenize by @wannaphong in #834
  • Add wanchanberta_thai_grammarly by @wannaphong in #836
  • Add extra segmentation style for paragraph_tokenize function by @pavaris-pm in #844
  • Improve: [newmm tokenizer] Change regular expression of "non-thai-characters" by @konbraphat51 in #856

Tag

Chat

Translate

Transliterate

Corpus

  • Add pythainlp.corpus.thai_orst_words() Thai word list from Royal Society of Thailand (ORST) #810 by @wannaphong
  • Add pythainlp.corpus.thai_wikipedia_titles() Thai word list (noun and noun phrases) from Thai Wikipedia titles #869 by @konbraphat51
  • Add pythainlp.corpus.thai_volubilis_words() Thai word list from Volubilis dictionary #870 by @konbraphat51
  • Add pythainlp.corpus.thai_icu_words() Thai word list from ICU BreakIterator dictionary #879 by @pavaris-pm
  • Rename Volubilis/Wikipedia corpus function names for consistency / Fix types by @bact in #882

Util

New Contributors

PyThaiNLP v5.0.0-dev2

15 Jan 07:49
Pre-release

What's Changed

Full Changelog: v5.0.0-dev1...v5.0.0-dev2

PyThaiNLP v5.0.0-dev1

19 Dec 15:48
Pre-release

What's Changed

Full Changelog: v5.0.0-dev0...v5.0.0-dev1

PyThaiNLP v5.0.0-dev0

26 Nov 09:22
abfbf02
Pre-release

What's Changed

  • Add extra segmentation style for paragraph_tokenize function by @pavaris-pm in #844
  • Update code comments and clean up code by @BLKSerene in #845
  • Improve the documentation by fixing typos and adding necessary details, explanations of the code, and missing details about models and examples by @Saharshjain78 in #850
  • Fix ISO 11940 duplicate keys by @bact in #851
  • Add pythainlp.util.rhyme by @wannaphong in #849
  • Fix duplicate key in IPA to RTGS phoneme mapping by @BLKSerene in #852
  • Fix tests of khavee functions by @BLKSerene in #854
  • Improve: [newmm tokenizer] Change regular expression of "non-thai-characters" by @konbraphat51 in #856
  • Add function for POS tagging with transformers by @MpolaarbearM in #857
  • Add: remove_trailing_repeat_consonants() by @konbraphat51 in #862
  • Update pos_tag_transformers function by @pavaris-pm in #865

New Contributors

Full Changelog: v4.1.0-beta5...v5.0.0-dev0