
CrossEncoder gives OSError with newest Transformers version #3129

Open · Pringled opened this issue Dec 10, 2024 · 5 comments

@Pringled (Contributor):

Hi 👋!

I noticed that CrossEncoder seems to break with the newest version of Transformers. It works fine on the previous version, but on 4.47.0 I get the following error:

OSError: Can't load the model for 'cross-encoder/ms-marco-MiniLM-L-6-v2'. If you were trying to load it from 'https://huggingface.co/models', make sure you don't have a local directory with the same name. Otherwise, make sure 'cross-encoder/ms-marco-MiniLM-L-6-v2' is the correct path to a directory containing a file named pytorch_model.bin, tf_model.h5, model.ckpt or flax_model.msgpack.

Environment

sentence-transformers==3.3.1
transformers==4.47.0

Using an M3 MacBook Pro.

Steps to reproduce

The following code gives the error:

from sentence_transformers import CrossEncoder

model = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")

NOTE: Downgrading to transformers==4.46.3 fixes it.
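For reference, the downgrade is just a version pin:

pip install transformers==4.46.3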

@tomaarsen (Collaborator):

Hello!

This seems to be related to the multiprocessing used to convert the model from pytorch_model.bin to model.safetensors. I'm not sure why that's done via multiprocessing.

Either way, the common fix for this multiprocessing issue is to wrap your CrossEncoder loading under an if __name__ == "__main__": guard:

from sentence_transformers import CrossEncoder

def main():
    model = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")
    query = "How many people live in Berlin?"
    docs = ["Berlin is the capital of Germany", "Berlin has a population of 3.6 million people", "Berlin's area is 891.8 km²"]

    # Compute similarity between the query and the documents
    scores = model.predict([(query, doc) for doc in docs])

    print("Query:", query)
    for doc, score in zip(docs, scores):
        print("Score:", score, "Doc:", doc)

if __name__ == "__main__":
    main()

I'll try and get a proper fix as well, but it'll probably have to be in transformers.

  • Tom Aarsen

@tomaarsen (Collaborator):

Made an issue here: huggingface/transformers#35228

@tomaarsen (Collaborator):

I've also added model.safetensors files to each of the original CrossEncoders, which means you shouldn't have this issue again with cross-encoder/ms-marco-MiniLM-L-6-v2.
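If you want to double-check a given model, a quick sketch (assuming huggingface_hub is installed) lists the repo files and looks for the safetensors weights:

from huggingface_hub import list_repo_files

# Sanity check: the repo should now ship safetensors weights
files = list_repo_files("cross-encoder/ms-marco-MiniLM-L-6-v2")
print("model.safetensors" in files)  # expected: True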

  • Tom Aarsen

@ydshieh commented Dec 12, 2024:

Thanks @tomaarsen. I am working on it.

@Pringled (Contributor, Author) commented Dec 12, 2024:

Hey @tomaarsen, great, thanks for the quick response! It works now for cross-encoder/ms-marco-MiniLM-L-6-v2 with the latest Transformers version, so that solves my issue. I'll use the __main__ wrapper solution if I need a different model. I'll leave this open for now if that's alright, and close it once the issue has been resolved in Transformers.
