Commit

Bump version to 5.4.0 [skip test]
maziyarpanahi committed Jun 26, 2024
1 parent ac9de09 commit e88682c
Showing 17 changed files with 140 additions and 161 deletions.
107 changes: 48 additions & 59 deletions README.md

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion build.sbt
@@ -6,7 +6,7 @@ name := getPackageName(is_silicon, is_gpu, is_aarch64)

organization := "com.johnsnowlabs.nlp"

-version := "5.4.0-rc2"
+version := "5.4.0"

(ThisBuild / scalaVersion) := scalaVer

2 changes: 1 addition & 1 deletion docs/_layouts/landing.html
@@ -201,7 +201,7 @@ <h3 class="grey h3_title">{{ _section.title }}</h3>
<div class="highlight-box">
{% highlight bash %}
# Using PyPI
-$ pip install spark-nlp==5.4.0-rc2
+$ pip install spark-nlp==5.4.0

# Using Anaconda/Conda
$ conda install -c johnsnowlabs spark-nlp
2 changes: 1 addition & 1 deletion docs/en/concepts.md
@@ -66,7 +66,7 @@ $ java -version
$ conda create -n sparknlp python=3.7 -y
$ conda activate sparknlp
# spark-nlp by default is based on pyspark 3.x
-$ pip install spark-nlp==5.4.0-rc2 pyspark==3.3.1 jupyter
+$ pip install spark-nlp==5.4.0 pyspark==3.3.1 jupyter
$ jupyter notebook
```

4 changes: 2 additions & 2 deletions docs/en/examples.md
@@ -18,7 +18,7 @@ $ java -version
# should be Java 8 (Oracle or OpenJDK)
$ conda create -n sparknlp python=3.7 -y
$ conda activate sparknlp
-$ pip install spark-nlp==5.4.0-rc2 pyspark==3.3.1
+$ pip install spark-nlp==5.4.0 pyspark==3.3.1
```

</div><div class="h3-box" markdown="1">
@@ -40,7 +40,7 @@ This script comes with the two options to define `pyspark` and `spark-nlp` versi
# -p is for pyspark
# -s is for spark-nlp
# by default they are set to the latest
-!bash colab.sh -p 3.2.3 -s 5.4.0-rc2
+!bash colab.sh -p 3.2.3 -s 5.4.0
```

[Spark NLP quick start on Google Colab](https://colab.research.google.com/github/JohnSnowLabs/spark-nlp/blob/master/examples/python/quick_start_google_colab.ipynb) is a live demo on Google Colab that performs named entity recognitions and sentiment analysis by using Spark NLP pretrained pipelines.
2 changes: 1 addition & 1 deletion docs/en/hardware_acceleration.md
@@ -49,7 +49,7 @@ Since the new Transformer models such as BERT for Word and Sentence embeddings a
| DeBERTa Large | +477%(5.8x) |
| Longformer Base | +52%(1.5x) |

-Spark NLP 5.4.0-rc2 is built with TensorFlow 2.7.1 and the following NVIDIA® software are only required for GPU support:
+Spark NLP 5.4.0 is built with TensorFlow 2.7.1 and the following NVIDIA® software are only required for GPU support:

- NVIDIA® GPU drivers version 450.80.02 or higher
- CUDA® Toolkit 11.2
54 changes: 27 additions & 27 deletions docs/en/install.md
@@ -17,22 +17,22 @@ sidebar:

```bash
# Install Spark NLP from PyPI
-pip install spark-nlp==5.4.0-rc2
+pip install spark-nlp==5.4.0

# Install Spark NLP from Anaconda/Conda
conda install -c johnsnowlabs spark-nlp

# Load Spark NLP with Spark Shell
-spark-shell --packages com.johnsnowlabs.nlp:spark-nlp_2.12:5.4.0-rc2
+spark-shell --packages com.johnsnowlabs.nlp:spark-nlp_2.12:5.4.0

# Load Spark NLP with PySpark
-pyspark --packages com.johnsnowlabs.nlp:spark-nlp_2.12:5.4.0-rc2
+pyspark --packages com.johnsnowlabs.nlp:spark-nlp_2.12:5.4.0

# Load Spark NLP with Spark Submit
-spark-submit --packages com.johnsnowlabs.nlp:spark-nlp_2.12:5.4.0-rc2
+spark-submit --packages com.johnsnowlabs.nlp:spark-nlp_2.12:5.4.0

# Load Spark NLP as external JAR after compiling and building Spark NLP by `sbt assembly`
-spark-shell --jars spark-nlp-assembly-5.4.0-rc2.jar
+spark-shell --jars spark-nlp-assembly-5.4.0.jar
```

</div><div class="h3-box" markdown="1">
@@ -55,7 +55,7 @@ $ java -version
# should be Java 8 (Oracle or OpenJDK)
$ conda create -n sparknlp python=3.8 -y
$ conda activate sparknlp
-$ pip install spark-nlp==5.4.0-rc2 pyspark==3.3.1
+$ pip install spark-nlp==5.4.0 pyspark==3.3.1
```

Of course you will need to have jupyter installed in your system:
@@ -92,7 +92,7 @@ spark = SparkSession.builder \
.config("spark.serializer", "org.apache.spark.serializer.KryoSerializer") \
.config("spark.kryoserializer.buffer.max", "2000M") \
.config("spark.driver.maxResultSize", "0") \
-.config("spark.jars.packages", "com.johnsnowlabs.nlp:spark-nlp_2.12:5.4.0-rc2") \
+.config("spark.jars.packages", "com.johnsnowlabs.nlp:spark-nlp_2.12:5.4.0") \
.getOrCreate()
```

@@ -109,7 +109,7 @@ spark = SparkSession.builder \
<dependency>
<groupId>com.johnsnowlabs.nlp</groupId>
<artifactId>spark-nlp_2.12</artifactId>
-<version>5.4.0-rc2</version>
+<version>5.4.0</version>
</dependency>
```

@@ -120,7 +120,7 @@ spark = SparkSession.builder \
<dependency>
<groupId>com.johnsnowlabs.nlp</groupId>
<artifactId>spark-nlp-gpu_2.12</artifactId>
-<version>5.4.0-rc2</version>
+<version>5.4.0</version>
</dependency>
```

@@ -131,7 +131,7 @@ spark = SparkSession.builder \
<dependency>
<groupId>com.johnsnowlabs.nlp</groupId>
<artifactId>spark-nlp-silicon_2.12</artifactId>
-<version>5.4.0-rc2</version>
+<version>5.4.0</version>
</dependency>
```

@@ -142,7 +142,7 @@ spark = SparkSession.builder \
<dependency>
<groupId>com.johnsnowlabs.nlp</groupId>
<artifactId>spark-nlp-aarch64_2.12</artifactId>
-<version>5.4.0-rc2</version>
+<version>5.4.0</version>
</dependency>
```

@@ -154,28 +154,28 @@ spark = SparkSession.builder \

```scala
// https://mvnrepository.com/artifact/com.johnsnowlabs.nlp/spark-nlp
-libraryDependencies += "com.johnsnowlabs.nlp" %% "spark-nlp" % "5.4.0-rc2"
+libraryDependencies += "com.johnsnowlabs.nlp" %% "spark-nlp" % "5.4.0"
```

**spark-nlp-gpu:**

```scala
// https://mvnrepository.com/artifact/com.johnsnowlabs.nlp/spark-nlp-gpu
-libraryDependencies += "com.johnsnowlabs.nlp" %% "spark-nlp-gpu" % "5.4.0-rc2"
+libraryDependencies += "com.johnsnowlabs.nlp" %% "spark-nlp-gpu" % "5.4.0"
```

**spark-nlp-silicon:**

```scala
// https://mvnrepository.com/artifact/com.johnsnowlabs.nlp/spark-nlp-silicon
-libraryDependencies += "com.johnsnowlabs.nlp" %% "spark-nlp-silicon" % "5.4.0-rc2"
+libraryDependencies += "com.johnsnowlabs.nlp" %% "spark-nlp-silicon" % "5.4.0"
```

**spark-nlp-aarch64:**

```scala
// https://mvnrepository.com/artifact/com.johnsnowlabs.nlp/spark-nlp-aarch64
-libraryDependencies += "com.johnsnowlabs.nlp" %% "spark-nlp-aarch64" % "5.4.0-rc2"
+libraryDependencies += "com.johnsnowlabs.nlp" %% "spark-nlp-aarch64" % "5.4.0"
```

Maven Central: [https://mvnrepository.com/artifact/com.johnsnowlabs.nlp](https://mvnrepository.com/artifact/com.johnsnowlabs.nlp)
@@ -257,15 +257,15 @@ maven coordinates like these:
<dependency>
<groupId>com.johnsnowlabs.nlp</groupId>
<artifactId>spark-nlp-silicon_2.12</artifactId>
-<version>5.4.0-rc2</version>
+<version>5.4.0</version>
</dependency>
```

or in case of sbt:

```scala
// https://mvnrepository.com/artifact/com.johnsnowlabs.nlp/spark-nlp
-libraryDependencies += "com.johnsnowlabs.nlp" %% "spark-nlp-silicon" % "5.4.0-rc2"
+libraryDependencies += "com.johnsnowlabs.nlp" %% "spark-nlp-silicon" % "5.4.0"
```

If everything went well, you can now start Spark NLP with the `m1` flag set to `true`:
@@ -302,7 +302,7 @@ spark = sparknlp.start(apple_silicon=True)

## Installation for Linux Aarch64 Systems

-Starting from version 5.4.0-rc2, Spark NLP supports Linux systems running on an aarch64
+Starting from version 5.4.0, Spark NLP supports Linux systems running on an aarch64
processor architecture. The necessary dependencies have been built on Ubuntu 16.04, so a
recent system with an environment of at least that will be needed.

@@ -350,7 +350,7 @@ This script comes with the two options to define `pyspark` and `spark-nlp` versi
# -p is for pyspark
# -s is for spark-nlp
# by default they are set to the latest
-!wget http://setup.johnsnowlabs.com/colab.sh -O - | bash /dev/stdin -p 3.2.3 -s 5.4.0-rc2
+!wget http://setup.johnsnowlabs.com/colab.sh -O - | bash /dev/stdin -p 3.2.3 -s 5.4.0
```

[Spark NLP quick start on Google Colab](https://colab.research.google.com/github/JohnSnowLabs/spark-nlp/blob/master/examples/python/quick_start_google_colab.ipynb) is a live demo on Google Colab that performs named entity recognitions and sentiment analysis by using Spark NLP pretrained pipelines.
@@ -372,7 +372,7 @@ Run the following code in Kaggle Kernel and start using spark-nlp right away.

## Databricks Support

-Spark NLP 5.4.0-rc2 has been tested and is compatible with the following runtimes:
+Spark NLP 5.4.0 has been tested and is compatible with the following runtimes:

**CPU:**

@@ -454,7 +454,7 @@ Spark NLP 5.4.0-rc2 has been tested and is compatible with the following runtime
3.1. Install New -> PyPI -> `spark-nlp` -> Install
-3.2. Install New -> Maven -> Coordinates -> `com.johnsnowlabs.nlp:spark-nlp_2.12:5.4.0-rc2` -> Install
+3.2. Install New -> Maven -> Coordinates -> `com.johnsnowlabs.nlp:spark-nlp_2.12:5.4.0` -> Install
4. Now you can attach your notebook to the cluster and use Spark NLP!
@@ -474,7 +474,7 @@ Note: You can import these notebooks by using their URLs.

## EMR Support

-Spark NLP 5.4.0-rc2 has been tested and is compatible with the following EMR releases:
+Spark NLP 5.4.0 has been tested and is compatible with the following EMR releases:

- emr-6.2.0
- emr-6.3.0
@@ -537,7 +537,7 @@ A sample of your software configuration in JSON on S3 (must be public access):
"spark.kryoserializer.buffer.max": "2000M",
"spark.serializer": "org.apache.spark.serializer.KryoSerializer",
"spark.driver.maxResultSize": "0",
-"spark.jars.packages": "com.johnsnowlabs.nlp:spark-nlp_2.12:5.4.0-rc2"
+"spark.jars.packages": "com.johnsnowlabs.nlp:spark-nlp_2.12:5.4.0"
}
}
]
@@ -547,7 +547,7 @@ A sample of AWS CLI to launch EMR cluster:

```sh
aws emr create-cluster \
---name "Spark NLP 5.4.0-rc2" \
+--name "Spark NLP 5.4.0" \
--release-label emr-6.2.0 \
--applications Name=Hadoop Name=Spark Name=Hive \
--instance-type m4.4xlarge \
@@ -812,7 +812,7 @@ We recommend using `conda` to manage your Python environment on Windows.
Now you can use the downloaded binary by navigating to `%SPARK_HOME%\bin` and
running
-Either create a conda env for python 3.6, install *pyspark==3.3.1 spark-nlp numpy* and use Jupyter/python console, or in the same conda env you can go to spark bin for *pyspark --packages com.johnsnowlabs.nlp:spark-nlp_2.12:5.4.0-rc2*.
+Either create a conda env for python 3.6, install *pyspark==3.3.1 spark-nlp numpy* and use Jupyter/python console, or in the same conda env you can go to spark bin for *pyspark --packages com.johnsnowlabs.nlp:spark-nlp_2.12:5.4.0*.
<img class="image image--xl" src="/assets/images/installation/90126972-c03e5500-dd64-11ea-8285-e4f76aa9e543.jpg" style="width:100%; align:center; box-shadow: 0 3px 6px rgba(0,0,0,0.16), 0 3px 6px rgba(0,0,0,0.23);"/>
@@ -840,12 +840,12 @@ spark = SparkSession.builder \
.config("spark.driver.memory","16G")\
.config("spark.driver.maxResultSize", "0") \
.config("spark.kryoserializer.buffer.max", "2000M")\
-.config("spark.jars", "/tmp/spark-nlp-assembly-5.4.0-rc2.jar")\
+.config("spark.jars", "/tmp/spark-nlp-assembly-5.4.0.jar")\
.getOrCreate()
```
- You can download provided Fat JARs from each [release notes](https://github.com/JohnSnowLabs/spark-nlp/releases), please pay attention to pick the one that suits your environment depending on the device (CPU/GPU) and Apache Spark version (3.x)
-- If you are local, you can load the Fat JAR from your local FileSystem, however, if you are in a cluster setup you need to put the Fat JAR on a distributed FileSystem such as HDFS, DBFS, S3, etc. (i.e., `hdfs:///tmp/spark-nlp-assembly-5.4.0-rc2.jar`)
+- If you are local, you can load the Fat JAR from your local FileSystem, however, if you are in a cluster setup you need to put the Fat JAR on a distributed FileSystem such as HDFS, DBFS, S3, etc. (i.e., `hdfs:///tmp/spark-nlp-assembly-5.4.0.jar`)
Example of using pretrained Models and Pipelines in offline:
2 changes: 1 addition & 1 deletion docs/en/spark_nlp.md
@@ -25,7 +25,7 @@ Spark NLP is built on top of **Apache Spark 3.x**. For using Spark NLP you need:

**GPU (optional):**

-Spark NLP 5.4.0-rc2 is built with TensorFlow 2.7.1 and the following NVIDIA® software are only required for GPU support:
+Spark NLP 5.4.0 is built with TensorFlow 2.7.1 and the following NVIDIA® software are only required for GPU support:

- NVIDIA® GPU drivers version 450.80.02 or higher
- CUDA® Toolkit 11.2
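
Every hunk shown above makes the same change: the literal string `5.4.0-rc2` becomes `5.4.0`. The commit does not say how the replacement was carried out, so the following is only a minimal sketch, assuming a plain literal search-and-replace over each file's text. The helper name `bump_version` and its defaults are hypothetical and are not part of the repository.

```python
import re
from typing import Tuple

# Version strings taken from the hunks above.
OLD_VERSION = "5.4.0-rc2"
NEW_VERSION = "5.4.0"


def bump_version(text: str, old: str = OLD_VERSION, new: str = NEW_VERSION) -> Tuple[str, int]:
    """Replace every literal occurrence of `old` with `new`.

    Returns the updated text and the number of replacements made, so the
    caller can report which files actually changed. (Hypothetical helper.)
    """
    # re.escape keeps the dots in the version from matching arbitrary characters.
    pattern = re.compile(re.escape(old))
    return pattern.subn(new, text)


if __name__ == "__main__":
    sample = "$ pip install spark-nlp==5.4.0-rc2 pyspark==3.3.1"
    updated, count = bump_version(sample)
    print(updated)  # $ pip install spark-nlp==5.4.0 pyspark==3.3.1
    print(count)    # 1
```

A literal replacement also rewrites prose that happens to name the release candidate, so the resulting diff is worth reviewing by hand before committing.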