How to Deploy an SBERT model? #245
Do you need any specific help?
Yes, so let's say I want to have a server where I can send an HTTP request with a sentence (query) and get back the embedding based on your pre-trained SBERT-NLI model. Do you have any suggestions on how I can do that?
You would need two components: the sentence embeddings service and the index server. For the sentence embeddings service, I would use FastAPI. It should only take a few lines of Python code to run sentence-transformers there and return the embedding for a submitted sentence. Then, for the corpus, you would need to index your sentence embeddings so that you can search them. If you have a small corpus, you could use Elasticsearch. For larger corpora, I can recommend Faiss.
Thanks a lot. Would you recommend Flask as another alternative for the sentence embedding service?
Flask also works.
Hi @nreimers, could you kindly elaborate on the Faiss vs. Elasticsearch question? How large are we talking when Faiss is recommended, and what problem might Elasticsearch have? Our backend team has a strong AWS background, and it would be hard to convince them not to use the AWS Elasticsearch Service.
Hi @FantasyCheese In my experiments, the latency up to around 100k documents was OK. But this of course depends on your setup and on how time-critical your task is. Faiss, on the other hand, uses approximate nearest neighbor (ANN) search to index the embeddings. There, you can retrieve the results within milliseconds, independent of how many vectors you have indexed. Even when you have millions or billions of docs (vectors), you can find the nearest neighbors efficiently. ANN is something Elasticsearch has been working on for about a year, but as far as I know it is not done yet.
@nreimers Wow that was fast and clear! Thanks a lot for your clarification!
Hi @ar3717, when you say
what do you mean? Have you fine-tuned some BERT-like model on a domain-specific dataset in an unsupervised manner, and then used this domain-specific BERT model to train an SBERT version using the same datasets used by @nreimers? Or have you taken another approach? Thanks in advance if you can share some information.
@GuardatiSimone Hi, yeah, that is basically what I have done so far: fine-tune BERT on my domain, then feed that fine-tuned BERT into SBERT and use one of the training datasets (e.g. NLI or STS) to retrain the SBERT. But I am planning to use my own corpus to retrain my SBERT based on the Wikipedia task in @nreimers' paper. What is your approach?
Hi @ar3717, and thanks for the reply. At the moment, I'm using pretrained SBERT to calculate the sentence embeddings, then feeding them into a Faiss index and building a backend (FastAPI) to get the top-K neighbors from it. Although pretrained SBERT is working well, I need to build an embedding system able to adapt to a domain-specific context, in an unsupervised manner. Does the approach you are using [1] produce better embeddings? And is the overall effort of [1] high in computational terms? [1] BERT fine-tuning on specific corpora + SBERT training on NLI/STS
It really depends on your data size. So far, I have tried fine-tuning BERT on a small domain-specific corpus and have seen some improvements, but I think if I increase my corpus size (that is specific to my domain), I will get much better results. The corpus I used for fine-tuning is around 6 MB, and fine-tuning took 15 minutes on an AWS ml.p3.2xlarge GPU instance. SBERT NLI training on the fine-tuned model took about 1.5 hours on the NLI data on the same instance. Let me know if that helps.
Hi, many thanks for sharing, it is very useful for me. I was trying to gather as much information as possible before starting my work, mainly to know more or less whether the goal (better embeddings) could be reached, and the likely price range of the machines for the training. So many thanks @ar3717 for the info and @nreimers for the amazing repository.
@cabhijith For SBERT, I used that script. For fine-tuning BERT, I used this one: https://github.com/huggingface/transformers/blob/v2.9.1/examples/language-modeling/run_language_modeling.py. Are you doing the same thing? If so, could you please let me know what your approach is, in case it is different from what I am doing?
@cabhijith You can try Milvus (https://github.com/milvus-io/milvus) or Annoy (https://github.com/spotify/annoy).
https://vespa.ai/ (https://github.com/vespa-engine/vespa) supports fast ANN tensor search / embedding retrieval using HNSW, and one can combine regular sparse retrieval with embedding-based retrieval in the same query. Our cord19.vespa.ai app uses Sentence-BERT embeddings for "Related articles"; example: https://cord19.vespa.ai/article/58938.
Hi, I am building a semantic search application and I want to deploy (put into production) my fine-tuned, domain-adapted SBERT model. Any ideas or recommendations for doing that?