
Experiencing https://github.com/huggingface/tokenizers/issues/537 issue when sentence-transformer is used for generating embeddings #794

Open · MyBruso opened this issue Mar 5, 2021 · 20 comments

MyBruso commented Mar 5, 2021

Hello,
I am using sentence-transformers to generate text embeddings via SentenceTransformer.encode().
This function is invoked from a multi-threaded program.
Sometimes the encode method fails with an 'Already borrowed' exception.
The issue is the same as, or similar to, huggingface/tokenizers#537.
My sentence-transformers version is 0.4.1.2.

Which version of sentence-transformers does not have this issue? Alternatively, what is the recommended way to resolve it?

nreimers (Member) commented Mar 5, 2021

Sentence-Transformers uses the Hugging Face code. Check the linked issue for solutions:

  • Multi-threading usually does not make sense in Python (due to the global interpreter lock). It is better to design a multi-process architecture in Python.
  • If you still stick to multi-threading, ensure that each thread has its own SentenceTransformer object. Sharing the object between threads leads to the linked issue.
  • You can also avoid the Rust-based fast tokenizer from Hugging Face and use the slow Python-based tokenizer instead:

In the model you are using, there is a folder 0_Transformer, which contains a tokenizer_config.json. Add a new entry to that JSON: "use_fast": false

This should load the slower Python tokenizer.
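
For illustration, a minimal sketch of what the edited tokenizer_config.json would contain; any existing entries vary by model and are omitted here:

{
  "use_fast": false
}

(As a later comment in this thread reports, this particular file may be read too late to take effect; see below.)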

MyBruso (Author) commented Mar 5, 2021

Thank you @nreimers, I will try using "use_fast": false in tokenizer_config.json.

The multi-threaded consumers can certainly be looked into; however, I do not have control over them.

In my service, a single model is used for generating embeddings, so I load it once and serve embeddings from that same model. Some of the clients are multi-threaded, and that is where the issue occurs.

aphedges (Contributor) commented Apr 9, 2021

I am in a similar situation to @MyBruso, and I am encountering the same problem.

@nreimers' suggestion of setting "use_fast": false in tokenizer_config.json did not work for me, unfortunately; the code path for the fast tokenizer is still being used. I will try loading a separate SentenceTransformer object in each thread, or see if I can figure out why the suggested config fix isn't working. I'll update here if I find anything useful.

@HariWu1995

@nreimers Excuse me, I have tried use_fast = false as you suggested, but the error still happens.
It is somewhat weird, though: when I train with tensorflow.keras.losses.categorical_crossentropy, it runs without error. The bug only happens when I use my customized loss functions, as below:

import numpy as np
from tensorflow.keras import backend as K

def weighted_cross_entropy(y_true, y_pred, pos_weight=1.618):
  # Binary cross-entropy with extra weight on the positive class,
  # clipped to keep individual losses from exploding
  losses = y_true * -K.log(y_pred) * pos_weight + (1 - y_true) * -K.log(1 - y_pred)
  losses = K.clip(losses, 0.0, 9.7)
  return K.mean(losses)

or

def categorical_focal_loss(n_classes, alpha: float=0.25, gamma: float=2.):
  alpha = np.array([[alpha]*n_classes], dtype=np.float32)
  def categorical_focal_loss_fixed(y_true, y_pred):
    # Clip the prediction value to prevent NaN's and Inf's
    epsilon = K.epsilon()
    y_pred = K.clip(y_pred, epsilon, 1.-epsilon)
    # Calculate Cross Entropy
    cross_entropy = -y_true * K.log(y_pred)
    # Calculate Focal Loss
    loss = alpha * K.pow(1-y_pred, gamma) * cross_entropy
    # Compute mean loss in mini_batch
    return K.mean(K.sum(loss, axis=-1))
  return categorical_focal_loss_fixed

I have used no library other than tensorflow and keras, so I have no idea what's wrong!

@aphedges (Contributor)

I spent some time inspecting the code, and I figured out a solution similar to the one @nreimers suggested that actually works for me: add "tokenizer_args": {"use_fast": false} to sentence_bert_config.json in 0_Transformer. The problem is that transformers decides whether to use the fast or the slow tokenizer before it even reads tokenizer_config.json, so adding the configuration there doesn't help.

For context, I am using sentence-transformers==1.0.3 and transformers==4.4.2. The model was trained with sentence-transformers==0.3.6 and transformers<=3.3.1 (the exact transformers version used is not documented).
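
For illustration, a minimal sketch of the edited sentence_bert_config.json; the max_seq_length entry is an example of an existing key and differs per model:

{
  "max_seq_length": 128,
  "tokenizer_args": {"use_fast": false}
}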

@nreimers (Member)

@aphedges Great, thanks for posting the solution here

@aphedges (Contributor)

You're welcome! I probably wouldn't have figured it out without your initial comment, so thank you.

I hope transformers/tokenizers fixes the underlying bug at some point, but this fix should be sufficient for now.

MyBruso (Author) commented Apr 14, 2021

Thank you for sharing your findings, @aphedges. By the way, I am using the pretrained distilbert-base-nli-stsb-mean-tokens model from sentence-transformers. I will try adding "tokenizer_args": {"use_fast": false} to sentence_bert_config.json.

MyBruso (Author) commented Apr 20, 2021

Thank you @nreimers and @aphedges. I am not experiencing the 'Already borrowed' error anymore.

Since we have touched on the topic of using a 'SentenceTransformer' object from a multi-threaded program, I am asking here: is there any way to improve the response time of the encode() (get embeddings) API when it is used from a multi-threaded server?

  • I have tried using a separate 'SentenceTransformer' object in each thread; however, this worsens the per-request response time, as the model loading time becomes overhead for each request.
  • I have seen a drastic improvement in response time when the server is single-threaded, but that is not an option, as requests start piling up and the overall response time on the client side grows.

@nreimers (Member)

Hi @MyBruso,
Have a look at these two articles:
https://huggingface.co/blog/bert-cpu-scaling-part-1
https://robloxtechblog.com/how-we-scaled-bert-to-serve-1-billion-daily-requests-on-cpus-d99be090db26

Instead of threads you should use processes. Each process should have the model loaded, and for each process you should limit the number of threads it can use; a sketch follows.
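
A minimal sketch of that pattern, assuming CPU inference; the pool size, thread limit, and model name are placeholders to adjust for your hardware:

import multiprocessing as mp

_model = None  # one model instance per worker process

def init_worker():
    # Runs once per worker: cap intra-op threads, then load the model
    global _model
    import torch
    from sentence_transformers import SentenceTransformer
    torch.set_num_threads(1)
    _model = SentenceTransformer("distilbert-base-nli-stsb-mean-tokens")

def encode(sentences):
    return _model.encode(sentences)

if __name__ == "__main__":
    with mp.Pool(processes=4, initializer=init_worker) as pool:
        embeddings = pool.map(encode, [["first request"], ["second request"]])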

@jkhalsa-arabesque

any news?

MyBruso (Author) commented May 21, 2021

> any news?

@jkhalsa-arabesque I assume you are asking whether I managed to use the model in a multi-process environment. The answer is that there is a limitation on my end that prevents using a multi-process server.

@jkhalsa-arabesque

> @jkhalsa-arabesque I assume you are asking whether I managed to use the model in a multi-process environment. The answer is that there is a limitation on my end that prevents using a multi-process server.

I was actually curious whether the underlying issue has been resolved.
With previous versions of transformers and tokenizers I didn't have this problem. When I am using CUDA, it just seg-faults.

MyBruso (Author) commented May 21, 2021

> I was actually curious whether the underlying issue has been resolved.
> With previous versions of transformers and tokenizers I didn't have this problem. When I am using CUDA, it just seg-faults.

If you are referring to the 'Already borrowed' exception, I am not observing it anymore; the solution is in this comment above.

As for whether it has been fixed in the transformers library, I do not have any updates on that.

@donderom

If the pretrained model's name is known, this can also be done programmatically:

from sentence_transformers import SentenceTransformer
from transformers import AutoTokenizer

model_name = "paraphrase-distilroberta-base-v2"
model = SentenceTransformer(model_name)
# Swap in the slow Python tokenizer in place of the Rust-based fast one
model.tokenizer = AutoTokenizer.from_pretrained(f"sentence-transformers/{model_name}", use_fast=False)
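
Note that this relies on SentenceTransformer exposing its tokenizer as a writable attribute; that holds for the versions discussed in this thread, but it is worth re-checking after upgrades.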

@aphedges (Contributor)

Recently, there was a fix in transformers (huggingface/transformers#12550) that greatly reduced, but did not completely eliminate, this error. The PR says that it's impossible to fully prevent the error because of how the Python-Rust integration works. The fix should be in v4.9.0 or later.

The only way to truly prevent the error is to not use the Rust-based tokenizer, but the workarounds posted in this thread are annoying to use. I have just created a PR (#1103) to make disabling the fast tokenizer easier. If my PR is merged, then something as simple as SentenceTransformer("sentence-transformers/stsb-roberta-base-v2", tokenizer_args={"use_fast": False}) would disable it.

@CaptXiong

> Recently, there was a fix in transformers (huggingface/transformers#12550) that greatly reduced, but did not completely eliminate, this error. […]
>
> The only way to truly prevent the error is to not use the Rust-based tokenizer […] I have just created a PR (#1103) to make disabling the fast tokenizer easier.

I am just curious which one is faster: disabling the fast tokenizer, or the fix in transformers that reduces the borrowing problem?

aphedges (Contributor) commented Dec 3, 2021

@CaptXiong, I'm not sure what you are asking. Whether it is faster to use the slow tokenizer (by disabling the fast tokenizer) or to use the fast tokenizer (with the fix that reduces the borrowing problem)? I would hope the fast tokenizer is still much faster, but I have not benchmarked it myself.

CaptXiong commented Dec 6, 2021

@aphedges Thank you! I referenced your code to rewrite the SentenceTransformer class, and the bug didn't come up again.
I have also tested the speed of the fast tokenizer against the slow one; there is not much difference between them for single-sentence encoding.

gsakkis commented Dec 18, 2023

Is there still interest in this? If so, I can post a PR fixing it by storing the tokenizer in thread-local storage.
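
For illustration, a minimal user-land sketch of that idea (not an actual PR): each thread lazily gets its own deep copy of the tokenizer, so the shared Rust object is never borrowed by two threads at once. It assumes model.tokenizer can be reassigned, as in the snippet earlier in this thread, and it trades memory for thread safety:

import copy
import threading

from sentence_transformers import SentenceTransformer

class ThreadLocalTokenizer:
    """Delegates every call to a per-thread deep copy of a template tokenizer."""

    def __init__(self, tokenizer):
        self._template = tokenizer
        self._local = threading.local()

    def _instance(self):
        # Create this thread's private copy on first use
        if not hasattr(self._local, "tokenizer"):
            self._local.tokenizer = copy.deepcopy(self._template)
        return self._local.tokenizer

    def __call__(self, *args, **kwargs):
        return self._instance()(*args, **kwargs)

    def __getattr__(self, name):
        return getattr(self._instance(), name)

model = SentenceTransformer("paraphrase-distilroberta-base-v2")
model.tokenizer = ThreadLocalTokenizer(model.tokenizer)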
