Releases: mesolitica/malaya

Version 4.0

16 Nov 04:27
  1. Added quantized versions of all Malaya models, reducing inference time by 2x and model size by 4x.
  2. Retrained constituency parsing, improving accuracy slightly, by ~1-2%.
  3. Added a sentence-level and word-level vectorization interface for all classification models.
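
The 4x size reduction in item 1 is what one would expect from storing 32-bit float weights as 8-bit integers. The following is a minimal sketch of symmetric int8 quantization under that assumption; it is not Malaya's actual quantization code, and the helper names `quantize_int8` and `dequantize` are hypothetical. It only illustrates the storage saving and the rounding error, not the inference speedup.

```python
from array import array


def quantize_int8(weights):
    """Symmetric per-tensor quantization of float weights to int8 (illustrative)."""
    max_abs = max(abs(w) for w in weights)
    # Map the largest magnitude onto 127 so every value fits in a signed byte.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = array("b", (int(round(w / scale)) for w in weights))
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float values from the int8 codes."""
    return [q * scale for q in quantized]


# Toy "model weights" stored as 32-bit floats.
weights = array("f", [0.12, -0.5, 0.33, 1.0, -0.98, 0.0, 0.75, -0.25])
quantized, scale = quantize_int8(weights)
restored = dequantize(quantized, scale)

float_bytes = weights.itemsize * len(weights)
int8_bytes = quantized.itemsize * len(quantized)
max_error = max(abs(a - b) for a, b in zip(weights, restored))

print(f"float32: {float_bytes} bytes, int8: {int8_bytes} bytes")
print(f"largest rounding error: {max_error:.5f}")
```

Each restored weight differs from the original by at most half a quantization step, which is why accuracy usually drops only slightly after this kind of quantization.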

Version 3.8.1

16 Aug 16:07
  1. Released constituency parsing

Version 3.8

05 Aug 18:13
  1. Improved spelling correction.
  2. Improved normalizer.
  3. Improved EN-MS translation, which now supports longer texts and US-style texts.

Version 3.7

10 Jul 05:02
  1. Added translation EN to MS and MS to EN modules.
  2. Added paraphrase module.
  3. Added keyword extraction module.

Version 3.4

27 Apr 13:53
release 3.4

Version 2.7

07 Aug 17:36
  1. BERT-Bahasa interface available.
  2. Added BERT-Multilanguage, BERT-Base and BERT-small for emotion analysis.
  3. Added BERT-Multilanguage, BERT-Base and BERT-small for Named Entity Recognition.
  4. Added BERT-Multilanguage, BERT-Base and BERT-small for Part-Of-Speech.
  5. Added BERT-Multilanguage and BERT-Base for relevancy analysis.
  6. Added BERT-Multilanguage, BERT-Base and BERT-small for sentiment analysis.
  7. Added encoder interface for text similarity, can use skip-thought / BERT / XLNET as encoder model.
  8. Added tree plot visualization for text similarity.
  9. Added BERT-Multilanguage, BERT-Base and BERT-small for subjectivity analysis.
  10. Added encoder interface for text summarization, can use skip-thought / BERT / XLNET as encoder model.
  11. Added BERT / XLNET interface for topic modeling.
  12. Added BERT-Multilanguage, BERT-Base and BERT-small for toxicity analysis.
  13. Removed siamese models for text similarity.
  14. Removed fast-text-char models, replaced by the BERT model.
  15. Malaya no longer supports a training interface.
  16. XLNET-Bahasa interface available.
  17. Sequence models are no longer being improved in Malaya; we have moved on to attention models.
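
Items 7 and 10 describe an encoder interface in which the similarity and summarization logic is independent of the model that produces the vectors. The sketch below illustrates that idea only; it is not Malaya's API. The names `Encoder`, `bag_of_words_encoder` and `most_similar` are hypothetical, and a toy bag-of-words encoder stands in for skip-thought, BERT or XLNET.

```python
import math
from collections import Counter
from typing import Callable, Dict, List, Tuple

Vector = Dict[str, float]
# Any callable that maps a string to a vector can act as the encoder.
Encoder = Callable[[str], Vector]


def bag_of_words_encoder(text: str) -> Vector:
    """Toy stand-in encoder: lowercase token counts."""
    return {token: float(count) for token, count in Counter(text.lower().split()).items()}


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two sparse vectors."""
    dot = sum(value * b.get(key, 0.0) for key, value in a.items())
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(query: str, candidates: List[str], encoder: Encoder) -> List[Tuple[str, float]]:
    """Rank candidates by similarity to the query, using whichever encoder is passed in."""
    query_vector = encoder(query)
    scored = [(text, cosine(query_vector, encoder(text))) for text in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


candidates = ["kerajaan umum bajet baharu", "pasukan bola sepak menang"]
ranked = most_similar("bajet kerajaan", candidates, bag_of_words_encoder)
for text, score in ranked:
    print(f"{score:.3f}  {text}")
```

Swapping in a different encoder only requires passing another callable with the same signature; the ranking code does not change.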

Version 2.6

25 Jun 03:56
  1. Added deep siamese network, https://malaya.readthedocs.io/en/latest/Similarity.html#deep-siamese-network.
  2. Added BERT deep siamese network, https://malaya.readthedocs.io/en/latest/Similarity.html#bert-model
  3. Added Doc2Vec to calculate semantic similarity, https://malaya.readthedocs.io/en/latest/Similarity.html#calculate-similarity-using-doc2vec
  4. All extractive summarization now uses the TextRank algorithm for scoring.
  5. Added Doc2Vec for extractive summarization, https://malaya.readthedocs.io/en/latest/Summarization.html#load-doc2vec-summarization

Version 2.4

01 Jun 05:40
  1. Added relevancy analysis, to assess whether an article or a piece of text is relevant or tends toward fake news. https://malaya.readthedocs.io/en/latest/Relevancy.html
  2. Added visualization dashboard for emotion analysis, relevancy analysis, sentiment analysis, subjectivity analysis and toxicity analysis. It is easy to use: call the predict_words function and the dashboard will pop up.
  3. Added neutral class for relevancy analysis, sentiment analysis and subjectivity analysis.
  4. All deep learning classification models now use Malaya preprocessing.

Version 1.9

27 Feb 14:34
  1. Fixed some English loading bugs.
  2. Added clustering visualization, https://malaya.readthedocs.io/en/latest/Cluster.html
  3. Added text augmentation, https://malaya.readthedocs.io/en/latest/Generator.html
  4. Normalizer and Spelling are now able to detect English words.

Version 1.7

15 Feb 12:37
  1. Added text similarity and released some of the related topics, https://malaya.readthedocs.io/en/latest/Similarity.html
  2. Added word-mover distance interface, https://malaya.readthedocs.io/en/latest/Mover.html
  3. Added pretrained fast-text based on wikipedia, https://malaya.readthedocs.io/en/latest/Fasttext.html
  4. Improved sentiment analysis, now trained on more than 800k sentences and more sensitive to social media texts.
  5. Removed n-grams from all fast-text models to reduce the curse of dimensionality.
  6. Removed the sparse limit from all fast-text-char models to improve n-gram sensitivity.