PyThaiNLP Versions

Thai Natural Language Processing in Python.

v5.0.3

1 day ago

PyThaiNLP v5.0.3 is a bug fix release of PyThaiNLP v5.0.2.

Install: pip install pythainlp

Upgrade: pip install -U pythainlp
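
To confirm which release is actually installed after running the commands above, a minimal sketch using only the Python standard library may help. The helper name `installed_version` is illustrative and is not part of PyThaiNLP.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str = "pythainlp") -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version()
    print(found if found else "pythainlp is not installed")
```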

See PyThaiNLP 5.0 Change Log: https://github.com/PyThaiNLP/pythainlp/issues/788.

What's Changed

New Contributors

Full Changelog: https://github.com/PyThaiNLP/pythainlp/compare/v5.0.2...v5.0.3
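
The "Full Changelog" links on this page all follow the same pattern: the repository URL, then `/compare/`, then the older and newer tags joined by three dots. A small sketch that builds such a link; the function name `compare_url` is hypothetical.

```python
REPO_URL = "https://github.com/PyThaiNLP/pythainlp"


def compare_url(old_tag: str, new_tag: str, repo_url: str = REPO_URL) -> str:
    """Build a GitHub compare link between two release tags."""
    return f"{repo_url}/compare/{old_tag}...{new_tag}"


if __name__ == "__main__":
    print(compare_url("v5.0.2", "v5.0.3"))
```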

v5.0.2

1 month ago

PyThaiNLP v5.0.2 is a bug fix release of PyThaiNLP v5.0.1.

Install: pip install pythainlp

Upgrade: pip install -U pythainlp

See PyThaiNLP 5.0 Change Log: https://github.com/PyThaiNLP/pythainlp/issues/788.

What's Changed

New Contributors

Full Changelog: https://github.com/PyThaiNLP/pythainlp/compare/v5.0.1...v5.0.2

Contributors

Thanks to all the contributors.

v5.0.1

3 months ago

PyThaiNLP v5.0.1 is a bug fix release of PyThaiNLP v5.0.0.

Install: pip install pythainlp

Upgrade: pip install -U pythainlp

See PyThaiNLP 5.0 Change Log: https://github.com/PyThaiNLP/pythainlp/issues/788.

What's Changed

  • Fixed bug: ImportError pycrfsuite #901

Full Changelog: https://github.com/PyThaiNLP/pythainlp/compare/v5.0.0...v5.0.1

v5.0.0

3 months ago

We are excited to announce PyThaiNLP 5.0, the latest release of PyThaiNLP, a Python library for Thai natural language processing (NLP).

With PyThaiNLP 5.0, you can expect improved performance and accuracy for NLP tasks in Thai. We have also added new functions to make your NLP tasks even easier and more efficient.

Install: pip install pythainlp

Upgrade: pip install -U pythainlp

See PyThaiNLP 5.0 Change Log: https://github.com/PyThaiNLP/pythainlp/issues/788.

What is new?

License information

Deprecation and other API changes

Dependency

  • Add tzdata as a dependency on Windows by @BLKSerene in #841
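
The release notes do not say why tzdata is needed. One likely reason, stated here as an assumption, is that Python's standard `zoneinfo` module reads the operating system's IANA time zone database, which Windows does not ship, and otherwise falls back to the `tzdata` package from PyPI. The sketch below checks whether time zone data is available; `has_timezone_data` is an illustrative helper, not a PyThaiNLP function.

```python
from zoneinfo import ZoneInfo, ZoneInfoNotFoundError


def has_timezone_data(key: str = "Asia/Bangkok") -> bool:
    """Return True if zoneinfo can load the given IANA time zone key."""
    try:
        ZoneInfo(key)
    except ZoneInfoNotFoundError:
        # Neither the system database nor the tzdata package provides this key.
        return False
    return True


if __name__ == "__main__":
    print("time zone data available:", has_timezone_data())
```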

New API

  • Add pythainlp.coref for Thai coreference resolution #802
  • Add wtpsplit to sentence segmentation & paragraph segmentation #804 and add paragraph_threshold into paragraph_tokenize() function #806
  • Add word approximation to pythainlp.soundex.sound #809 by @wannaphong
  • Add pythainlp.wsd for Thai word sense disambiguation #818 by @wannaphong
  • Add pythainlp.chat and WangChanGLM to pythainlp.generate #819 by @wannaphong
  • Add pythainlp.cls a param-free classification model #821 by @c4n
  • Add pythainlp.el entity linking #822 by @wannaphong
  • Add pythainlp.ancient by @wannaphong in #833
  • Add pythainlp.util.rhyme by @wannaphong in #849
  • Add remove_trailing_repeat_consonants by @konbraphat51 in #862
  • Add pythainlp.util.to_idn by @wannaphong in #875
  • Add pythainlp.corpus.find_synonyms by @wannaphong in #890
  • Add pythainlp.util.morse by @wannaphong in #891
  • Add pythainlp.morpheme by @wannaphong in #896

Improve

  • Update code comments and clean up codes by @BLKSerene in #845
  • Improve the documentation by fixing typos and adding necessary details, explanations of the code, and missing information about models and examples by @Saharshjain78 in #850
  • Fix tests of khavee functions by @BLKSerene in #854
  • Update Git Actions versions by @bact in #878
  • Fix ruff args in workflow by @bact in #880
  • Revise ruff args in workflow by @bact in #881
  • Fix coref return type and add fallback by @bact in #883
  • Fix wrong/incompatible types, code readability by @bact in #884
  • Bump protobuf from 3.20 to 3.20.2 in #885
  • Add license info to /tests and README_TH.md by @bact in #886
  • phayathaibert, khavee, parse: Code clean up by @bact in #889
  • ruff: docstring-code-format = true by @bact in #892

Tokenizer

  • Add wtpsplit engine to sentence_tokenize #804
  • New paragraph_tokenize function to split Thai text into paragraphs #804
  • Add paragraph_threshold into paragraph_tokenize() function #806 by @pavaris-pm
  • Add 🪿 Han-solo by @wannaphong in #830
  • Fix newmm to better handle non-Thai characters in tokens #856 by @konbraphat51
  • Fix incorrect passing of flags to re.split by @hauntsaninja in #832
  • Add syllable_tokenize by @wannaphong in #834
  • Add wanchanberta_thai_grammarly by @wannaphong in #836
  • Add extra segmentation style for paragraph_tokenize function by @pavaris-pm in #844
  • Improve: [newmm tokenizer] Change regular expression of "non-thai-characters" by @konbraphat51 in #856
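
As a rough illustration of the newmm changes listed above, the sketch below tokenizes text that mixes Thai and non-Thai characters. It assumes `word_tokenize` from `pythainlp.tokenize` with `engine="newmm"`; the wrapper `tokenize_newmm` and the sample sentence are illustrative, and the exact token boundaries depend on the installed version, so no expected output is shown.

```python
from typing import List, Optional

try:
    from pythainlp.tokenize import word_tokenize
except ImportError:  # PyThaiNLP is not installed in this environment
    word_tokenize = None


def tokenize_newmm(text: str) -> Optional[List[str]]:
    """Tokenize with the newmm engine; return None if PyThaiNLP is unavailable."""
    if word_tokenize is None:
        return None
    return word_tokenize(text, engine="newmm")


if __name__ == "__main__":
    # Thai text mixed with Latin letters and digits.
    print(tokenize_newmm("ฉันใช้ Python 3 ทุกวัน"))
```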

Tag

  • Add function for pos tag with transformers by @MpolaarbearM in #857
  • Update pos_tag_transformers function by @pavaris-pm in #865
  • Add PhayaThaiBERT engine with new features by @pavaris-pm in #873

Chat

  • Fixed bug #828

Translate

  • Add small100 to pythainlp.translate #815 by @wannaphong

Transliterate

  • Fix duplicate keys in ISO 11940 and IPA-RTGS phoneme mapping #851 #852 by @BLKSerene and @bact
  • Fix duplicate key in IPA to RTGS phoneme mapping by @BLKSerene in #852

Corpus

  • Add pythainlp.corpus.thai_orst_words() Thai word list from Royal Society of Thailand (ORST) #810 by @wannaphong
  • Add pythainlp.corpus.thai_wikipedia_titles() Thai word list (noun and noun phrases) from Thai Wikipedia titles #869 by @konbraphat51
  • Add pythainlp.corpus.thai_volubilis_words() Thai word list from Volubilis dictionary #870 by @konbraphat51
  • Add pythainlp.corpus.thai_icu_words() Thai word list from ICU BreakIterator dictionary #879 by @pavaris-pm
  • Rename Volubilis/Wikipedia corpus function names for consistency / Fix types by @bact in #882
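
The word-list functions above can be used for simple membership checks. A minimal sketch, assuming `thai_orst_words()` returns a set-like collection of strings as the other PyThaiNLP word-list functions do; the helper `in_orst_word_list` is hypothetical.

```python
from typing import Optional

try:
    from pythainlp.corpus import thai_orst_words
except ImportError:  # PyThaiNLP missing, or older than 5.0
    thai_orst_words = None


def in_orst_word_list(word: str) -> Optional[bool]:
    """Check whether `word` is in the ORST word list; None if unavailable."""
    if thai_orst_words is None:
        return None
    return word in thai_orst_words()


if __name__ == "__main__":
    print(in_orst_word_list("ภาษา"))
```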

Util

  • Add pythainlp.util.encoding #813 by @wannaphong
  • Add pythainlp.util.spell_words #817 by @wannaphong
  • Add pythainlp.util.remove_trailing_repeat_consonants() #862 by @konbraphat51
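
A hedged usage sketch for `remove_trailing_repeat_consonants()`, assuming it takes the text as its first argument and returns the cleaned string. The wrapper name `squash_trailing` and the sample input are illustrative only.

```python
from typing import Optional

try:
    from pythainlp.util import remove_trailing_repeat_consonants
except ImportError:  # PyThaiNLP missing, or older than 5.0
    remove_trailing_repeat_consonants = None


def squash_trailing(text: str) -> Optional[str]:
    """Collapse repeated trailing consonants; None if PyThaiNLP is unavailable."""
    if remove_trailing_repeat_consonants is None:
        return None
    return remove_trailing_repeat_consonants(text)


if __name__ == "__main__":
    # A word with its final consonant repeated for emphasis.
    print(squash_trailing("เริ่ดดดดดดดด"))
```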

New Contributors

  • @pavaris-pm made their first contribution in #806
  • @hauntsaninja made their first contribution in #832
  • @Saharshjain78 made their first contribution in #850
  • @konbraphat51 made their first contribution in #856
  • @MpolaarbearM made their first contribution in #857

Full Changelog: https://github.com/PyThaiNLP/pythainlp/compare/v4.0.2...v5.0.0

v5.0.0-beta1

3 months ago

Schedule

  • First Beta release: 5 February 2024
  • Production release: 10 February 2024

See 5.0 Milestone.

What is new?

License information

  • Use SPDX license identifier at the header of source code #876

Deprecation and other API changes

Dependency

  • Add tzdata as a dependency on Windows by @BLKSerene in #841

New API

  • Add pythainlp.coref for Thai coreference resolution #802
  • Add wtpsplit to sentence segmentation & paragraph segmentation #804 and add paragraph_threshold into paragraph_tokenize() function #806
  • Add word approximation to pythainlp.soundex.sound #809 by @wannaphong
  • Add pythainlp.wsd for Thai word sense disambiguation #818 by @wannaphong
  • Add pythainlp.chat and WangChanGLM to pythainlp.generate #819 by @wannaphong
  • Add pythainlp.cls a param-free classification model #821 by @c4n
  • Add pythainlp.el entity linking #822 by @wannaphong
  • Add pythainlp.ancient by @wannaphong in #833
  • Add pythainlp.util.rhyme by @wannaphong in #849
  • Add: remove_trailing_repeat_consonants by @konbraphat51 in #862
  • Add pythainlp.util.to_idn by @wannaphong in #875
  • Add pythainlp.corpus.find_synonyms by @wannaphong in #890
  • Add pythainlp.util.morse by @wannaphong in #891
  • Add pythainlp.morpheme by @wannaphong in #896

Improve

  • Update code comments and clean up codes by @BLKSerene in #845
  • Improve the documentation by fixing typos and adding necessary details, explanations of the code, and missing information about models and examples by @Saharshjain78 in #850
  • Fix tests of khavee functions by @BLKSerene in #854
  • Update Git Actions versions by @bact in #878
  • Fix ruff args in workflow by @bact in #880
  • Revise ruff args in workflow by @bact in #881
  • Fix coref return type and add fallback by @bact in #883
  • Fix wrong/incompatible types, code readability by @bact in #884
  • Bump protobuf from 3.20 to 3.20.2 in #885
  • Add license info to /tests and README_TH.md by @bact in #886
  • phayathaibert, khavee, parse: Code clean up by @bact in #889
  • ruff: docstring-code-format = true by @bact in #892

Tokenizer

  • Add wtpsplit engine to sentence_tokenize #804
  • New paragraph_tokenize function to split Thai text into paragraphs #804
  • Add paragraph_threshold into paragraph_tokenize() function #806 by @pavaris-pm
  • Add 🪿 Han-solo by @wannaphong in #830
  • Fix newmm to better handle non-Thai characters in tokens #856 by @konbraphat51
  • Fix incorrect passing of flags to re.split by @hauntsaninja in #832
  • Add syllable_tokenize by @wannaphong in #834
  • Add wanchanberta_thai_grammarly by @wannaphong in #836
  • Add extra segmentation style for paragraph_tokenize function by @pavaris-pm in #844
  • Improve: [newmm tokenizer] Change regular expression of "non-thai-characters" by @konbraphat51 in #856

Tag

  • Add function for pos tag with transformers by @MpolaarbearM in #857
  • Update pos_tag_transformers function by @pavaris-pm in #865
  • Add PhayaThaiBERT engine with new features by @pavaris-pm in #873

Chat

  • Fixed bug #828

Translate

  • Add small100 to pythainlp.translate #815 by @wannaphong

Transliterate

  • Fix duplicate keys in ISO 11940 and IPA-RTGS phoneme mapping #851 #852 by @BLKSerene and @bact
  • Fix duplicate key in IPA to RTGS phoneme mapping by @BLKSerene in #852

Corpus

  • Add pythainlp.corpus.thai_orst_words() Thai word list from Royal Society of Thailand (ORST) #810 by @wannaphong
  • Add pythainlp.corpus.thai_wikipedia_titles() Thai word list (noun and noun phrases) from Thai Wikipedia titles #869 by @konbraphat51
  • Add pythainlp.corpus.thai_volubilis_words() Thai word list from Volubilis dictionary #870 by @konbraphat51
  • Add pythainlp.corpus.thai_icu_words() Thai word list from ICU BreakIterator dictionary #879 by @pavaris-pm
  • Rename Volubilis/Wikipedia corpus function names for consistency / Fix types by @bact in #882

Util

  • Add pythainlp.util.encoding #813 by @wannaphong
  • Add pythainlp.util.spell_words #817 by @wannaphong
  • Add pythainlp.util.remove_trailing_repeat_consonants() #862 by @konbraphat51

New Contributors

  • @pavaris-pm made their first contribution in #806
  • @hauntsaninja made their first contribution in #832
  • @Saharshjain78 made their first contribution in #850
  • @konbraphat51 made their first contribution in #856
  • @MpolaarbearM made their first contribution in #857

v5.0.0-dev2

3 months ago

v5.0.0-dev1

4 months ago

What's Changed

Full Changelog: https://github.com/PyThaiNLP/pythainlp/compare/v5.0.0-dev0...v5.0.0-dev1

v5.0.0-dev0

5 months ago

What's Changed

New Contributors

Full Changelog: https://github.com/PyThaiNLP/pythainlp/compare/v4.1.0-beta5...v5.0.0-dev0

v4.1.0-beta5

7 months ago

Docs: https://pythainlp.github.io/dev-docs/

Report bug: https://github.com/PyThaiNLP/pythainlp/issues

Install: pip install --pre pythainlp
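
The `--pre` flag is needed because pip skips pre-release versions by default. The sketch below classifies and orders the tag shapes that appear on this page (`vX.Y.Z`, `vX.Y.Z-devN`, `vX.Y.Z-betaN`). It is a simplified illustration, not a full version parser; the names `tag_key` and `is_prerelease` are hypothetical, and real tooling would more likely use the `packaging` library.

```python
import re
from typing import Tuple

# Matches only the tag shapes used on this page.
_TAG = re.compile(r"^v(\d+)\.(\d+)\.(\d+)(?:-(dev|beta)(\d+))?$")
# dev releases sort before beta releases, which sort before final releases.
_STAGE_RANK = {"dev": 0, "beta": 1, None: 2}


def tag_key(tag: str) -> Tuple[int, int, int, int, int]:
    """Return a sortable key for a release tag."""
    match = _TAG.match(tag)
    if match is None:
        raise ValueError(f"unrecognized tag: {tag!r}")
    major, minor, patch, stage, stage_num = match.groups()
    return (int(major), int(minor), int(patch),
            _STAGE_RANK[stage], int(stage_num or 0))


def is_prerelease(tag: str) -> bool:
    """True for dev and beta tags, False for final releases."""
    return tag_key(tag)[3] < _STAGE_RANK[None]


if __name__ == "__main__":
    tags = ["v5.0.0", "v4.1.0-beta5", "v5.0.0-dev0", "v5.0.1", "v5.0.0-beta1"]
    for tag in sorted(tags, key=tag_key):
        print(tag, "(pre-release)" if is_prerelease(tag) else "")
```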

See 4.1 Milestone.

What's Changed

Full Changelog: https://github.com/PyThaiNLP/pythainlp/compare/v4.1.0-beta4...v4.1.0-beta5

v4.1.0-beta4

8 months ago

Docs: https://pythainlp.github.io/dev-docs/

Report bug: https://github.com/PyThaiNLP/pythainlp/issues

Install: pip install --pre pythainlp

See 4.1 Milestone.

What's Changed

New Contributors

Full Changelog: https://github.com/PyThaiNLP/pythainlp/compare/v4.1.0-beta3...v4.1.0-beta4