Replies: 2 comments
I am new to Spark ML pipelines. I am facing a similar speed issue while doing inference. I also found spaCy to be 20 times faster than Spark NLP. To speed up Spark NLP, should I run it on a bigger Spark cluster to distribute the tasks better?
Hi, Without them, it's really hard to compare and give any advice.
I am using spark-nlp for identifying named entities and tagging the text. I want to understand the best approach to speed up this process.
My text data is around 20 GB, with each sentence on a separate line. I am calling spark-nlp with chunks of this data. Regardless of the chunk size, it is pretty slow compared to spaCy NER; the latter is at least 20 times faster.
This makes me believe that I might not be using spark-nlp in the best possible way. Are there any best practices to increase the speed when using CPU? Please advise.