John Snow Labs Spark-NLP 3.3.3: New DistilBERT for Sequence Classification, new trainable and distributed Doc2Vec, BERT improvements on GPU, new state-of-the-art DistilBERT models for topic and sentiment detection, enhancements, and bug fixes! #6505
maziyarpanahi announced in Announcement
Overview
(knock, knock, knock) Penny? Yes, this is a very special release if you are obsessed with the number 3 as much as we are! So we are pleased to announce the Spark NLP 🚀 3.3.3 release! 🎉 🎊 🎈 This release comes with a new DistilBertForSequenceClassification annotator for existing or fine-tuned DistilBERT text classification models from HuggingFace, a new distributed and trainable Doc2Vec annotator based on the Word2Vec implementation in Spark ML, improvements to BertEmbeddings and BertSentenceEmbeddings on a single machine with a GPU device when the DataFrame has 1 sentence per row or the input column is set to document, new state-of-the-art fine-tuned DistilBERT models for sequence classification, enhancements, bug fixes, and more!
As always, we would like to thank our community for their feedback, questions, and feature requests.
New Features and Enhancements
DistilBertForSequenceClassification
DistilBertForSequenceClassification can load DistilBERT models with a sequence classification/regression head on top (a linear layer on top of the pooled output), e.g. for multi-class document classification tasks. This annotator is compatible with all the models trained/fine-tuned by using DistilBertForSequenceClassification or TFDistilBertForSequenceClassification in HuggingFace 🤗
Bug Fixes
opus_mt_mul_en and opus_mt_mul_en
lemma_antbnc model to Models Hub
sentiment_vivekn model to Models Hub
spellcheck_norvig model to Models Hub
Models
New state-of-the-art fine-tuned DistilBERT models for Sequence Classification:
Featured Pretrained Models
en 3.3.3
en 3.3.3
en 3.3.3
en 3.3.3
3.3.3
fr 3.3.3
ur 3.3.3
en 3.3.3
en 3.3.3
en 3.3.3
en 3.3.3
en 3.3.3
The complete list of all 4000+ models & pipelines in 200+ languages is available on Models Hub.
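As a rough illustration of how a DistilBERT sequence classification model might be wired into a pipeline, here is a minimal Python sketch. The helper names `build_classification_pipeline` and `run_demo`, the column names, and the sample sentence are assumptions made for this example. `pretrained()` is called without arguments, so it loads whatever default model the library ships; a specific model name and language from Models Hub can be passed instead.

```python
def build_classification_pipeline(text_col="text"):
    """Assemble a Spark ML Pipeline ending in DistilBertForSequenceClassification.

    Hypothetical helper for illustration; imports are done lazily so the
    function can be defined on a machine without Spark NLP installed.
    """
    from pyspark.ml import Pipeline
    from sparknlp.base import DocumentAssembler
    from sparknlp.annotator import Tokenizer, DistilBertForSequenceClassification

    # Turn raw text into Spark NLP "document" annotations.
    document_assembler = (
        DocumentAssembler().setInputCol(text_col).setOutputCol("document")
    )
    # Split each document into tokens.
    tokenizer = Tokenizer().setInputCols(["document"]).setOutputCol("token")
    # Default pretrained model (assumption: swap in a Models Hub name as needed).
    classifier = (
        DistilBertForSequenceClassification.pretrained()
        .setInputCols(["token", "document"])
        .setOutputCol("class")
    )
    return Pipeline(stages=[document_assembler, tokenizer, classifier])


def run_demo():
    """Classify one made-up sentence; requires Spark NLP and network access."""
    import sparknlp

    spark = sparknlp.start()
    data = spark.createDataFrame([["I really liked that movie!"]]).toDF("text")
    model = build_classification_pipeline().fit(data)
    model.transform(data).select("class.result").show(truncate=False)
```

Calling `run_demo()` on a machine with Spark NLP 3.3.3 installed should download the default model on first use and print the predicted label for the sample sentence.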
New Notebooks
Documentation
Installation
Python
# PyPI
pip install spark-nlp==3.3.3
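After installing, it can be useful to confirm which version actually ended up in the active environment. The following is a small sketch that uses only the Python standard library; the helper name `installed_spark_nlp_version` is made up for this example.

```python
from importlib import metadata

EXPECTED_VERSION = "3.3.3"  # the release announced here


def installed_spark_nlp_version():
    """Return the installed spark-nlp version string, or None if it is missing."""
    try:
        return metadata.version("spark-nlp")
    except metadata.PackageNotFoundError:
        return None


found = installed_spark_nlp_version()
if found is None:
    print("spark-nlp is not installed in this environment")
elif found != EXPECTED_VERSION:
    print(f"spark-nlp {found} is installed, expected {EXPECTED_VERSION}")
else:
    print(f"spark-nlp {found} is installed")
```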
Spark Packages
spark-nlp on Apache Spark 3.0.x and 3.1.x (Scala 2.12 only):
GPU
spark-nlp on Apache Spark 2.4.x (Scala 2.11 only):
GPU
spark-nlp on Apache Spark 2.3.x (Scala 2.11 only):
GPU
Maven
spark-nlp on Apache Spark 3.0.x and 3.1.x:
spark-nlp-gpu:
spark-nlp on Apache Spark 2.4.x:
spark-nlp-gpu:
spark-nlp on Apache Spark 2.3.x:
spark-nlp-gpu:
FAT JARs
CPU on Apache Spark 3.x: https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/jars/spark-nlp-assembly-3.3.3.jar
GPU on Apache Spark 3.x: https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/jars/spark-nlp-gpu-assembly-3.3.3.jar
CPU on Apache Spark 2.4.x: https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/jars/spark-nlp-spark24-assembly-3.3.3.jar
GPU on Apache Spark 2.4.x: https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/jars/spark-nlp-gpu-spark24-assembly-3.3.3.jar
CPU on Apache Spark 2.3.x: https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/jars/spark-nlp-spark23-assembly-3.3.3.jar
GPU on Apache Spark 2.3.x: https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/jars/spark-nlp-gpu-spark23-assembly-3.3.3.jar
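The six FAT JAR URLs above follow a regular naming pattern, so the right one can be picked programmatically. Below is a minimal sketch of that mapping; the function name `fat_jar_url` and its parameters are hypothetical, and the pattern is inferred only from the URLs listed in this announcement.

```python
BASE_URL = "https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/jars"
VERSION = "3.3.3"

# Artifact-name suffix per Spark line, as it appears in the URLs listed above.
_SPARK_SUFFIX = {"3": "", "2.4": "-spark24", "2.3": "-spark23"}


def fat_jar_url(spark_version: str, gpu: bool = False) -> str:
    """Return the FAT JAR URL for a Spark version such as '3.1.2' or '2.4.8'."""
    parts = spark_version.split(".")
    # All Spark 3.x versions share one JAR; Spark 2 is keyed by major.minor.
    key = "3" if parts[0] == "3" else ".".join(parts[:2])
    if key not in _SPARK_SUFFIX:
        raise ValueError(f"No FAT JAR listed for Spark {spark_version}")
    device = "-gpu" if gpu else ""
    return f"{BASE_URL}/spark-nlp{device}{_SPARK_SUFFIX[key]}-assembly-{VERSION}.jar"


print(fat_jar_url("3.1.2"))
print(fat_jar_url("2.4.8", gpu=True))
```

A downloaded JAR is typically handed to Spark through the `spark.jars` configuration when building the session, for example `.config("spark.jars", "/path/to/spark-nlp-assembly-3.3.3.jar")`, where the local path is a placeholder.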
What's Changed
New Contributors
Full Changelog: 3.3.2...3.3.3