How to Deploy an SBERT model? #245
Do you need any specific help?
Yes, so let's say I want to have a server where I can send an HTTP request with a sentence (query) and get back the embedding based on your pre-trained SBERT-NLI model. Do you have any suggestions on how I can do that?
You would need two components: the sentence embeddings service and the index server. For the sentence embeddings service, I would use FastAPI. It should only take a few lines of Python code to run sentence-transformers there and return the embedding for a submitted sentence. Then, for the corpus, you would need to index your sentence embeddings so that you can search them. If you have a small corpus, you could use Elasticsearch. For larger corpora, I can recommend Faiss.
Thanks a lot. Would you recommend Flask as another alternative for the sentence embedding service?
Flask also works.
Hi @nreimers, could you kindly elaborate on the Faiss vs. Elasticsearch question? How large are we talking when Faiss is recommended, and what problem might Elasticsearch have? Our backend team has a strong AWS background, and it would be hard to convince them not to use the AWS Elasticsearch Service.
Hi @FantasyCheese In my experiments, the latency up to around 100k documents was OK. But this of course depends on your setup and on how time-critical your task is. Faiss, on the other hand, uses approximate nearest neighbor (ANN) search to index the embeddings. There, you can retrieve the results within milliseconds, independent of how many vectors you have indexed. Even when you have millions or billions of docs (vectors), you can find the nearest neighbors efficiently. ANN is something Elasticsearch has been working on for about a year, but as far as I know it is not done yet.
@nreimers Wow that was fast and clear! Thanks a lot for your clarification!
Hi @ar3717, when you say
what do you mean? Have you fine-tuned some BERT-like model on a domain-specific dataset in an unsupervised manner, and then used this domain-specific BERT model to train an SBERT version using the same datasets used by @nreimers? Or have you taken another approach? Thanks in advance if you can share some information.
@GuardatiSimone Hi, yeah, that is basically what I have done so far: fine-tune BERT on my domain, then feed that fine-tuned BERT into SBERT and use one of the training datasets (e.g. NLI or STS) to retrain the SBERT. But I am planning to use my own corpus to retrain my SBERT based on the Wikipedia task in @nreimers' paper. What is your approach?
Hi @ar3717, and thanks for the reply. At the moment, I'm using pretrained SBERT to calculate the sentence embeddings, then feeding them into a Faiss index and building a backend (FastAPI) to get the top-K neighbors from it. Although pretrained SBERT is working well, I need to build an embedding system able to adapt to a domain-specific context, in an unsupervised manner. Does the approach you are using [1] produce better embeddings? And is the overall effort of [1] high in computational terms? [1] BERT fine-tuning on specific corpora + SBERT training on NLI/STS
It really depends on your data size. So far, I have tried fine-tuning BERT on a small domain-specific corpus and have seen some improvements, but I think if I increase my corpus size (that is specific to my domain), I will get much better results. The corpus I used for fine-tuning is around 6 MB, and fine-tuning took 15 minutes on an AWS ml.p3.2xlarge GPU instance. SBERT NLI training on the fine-tuned model took about 1.5 hours on the NLI data on the same instance. Let me know if that helps.
Hi, many thanks for sharing, it is very useful for me. I was trying to gather as much information as possible before starting my work, mainly to know more or less whether the goal (better embeddings) could be reached, and the likely price range of the machines for the training. So many thanks @ar3717 for the info and @nreimers for the amazing repository.
@cabhijith For SBERT, I used that script. For fine-tuning BERT, I used this one: https://github.com/huggingface/transformers/blob/v2.9.1/examples/language-modeling/run_language_modeling.py. Are you doing the same thing? If so, could you please let me know what your approach is, in case it is different from what I am doing?
@cabhijith You can try Milvus (https://github.com/milvus-io/milvus) or Annoy (https://github.com/spotify/annoy).
https://vespa.ai/ (https://github.com/vespa-engine/vespa) supports fast ANN tensor search / embedding retrieval using HNSW, and one can combine regular sparse retrieval with embedding-based retrieval in the same query. Our cord19.vespa.ai app uses Sentence-BERT embeddings for "Related articles"; example: https://cord19.vespa.ai/article/58938.
Hi, I am building a semantic search application and I want to deploy (put into production) my fine-tuned, domain-adapted SBERT model. Any ideas or recommendations for doing that?