Does AddBertOnnxTextEmbeddingGeneration support multilingual models like intfloat/multilingual-e5-small? #8413
-
Hi, while I don't get an error, I get unexpected results: the results ranked as most similar are not actually similar. Should this model work with SK? I know SK only supports BERT-based models, and it is my understanding that e5-small is based on XLM-R, which is itself based on BERT. Let me know if I misunderstood or if the model should actually work. It would be great if we could have an example.
Replies: 2 comments
-
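Hi @gonsss, thanks for the comment. I couldn't test this myself, as the multilingual-e5-small repo does not have a vocab.txt file, which is required. If you have a vocab.txt file and can provide it or update the repo with it, that would be helpful for following up. For example, this one has it and works fine: TaylorAI-bge-micro-v2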
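A quick way to see which situation a downloaded model folder is in is to look at the tokenizer files it ships. The sketch below is only an illustration, not part of SK: the helper name `tokenizer_kind` is hypothetical, and the file names other than vocab.txt are assumptions about how Hugging Face repos commonly lay out SentencePiece tokenizers.

```python
from pathlib import Path

# vocab.txt is the file named in this thread as required.
# The SentencePiece file names are assumptions about common repo layouts.
WORDPIECE_FILES = ("vocab.txt",)
SENTENCEPIECE_FILES = ("sentencepiece.bpe.model", "spiece.model")


def tokenizer_kind(model_dir):
    """Return 'wordpiece', 'sentencepiece', or 'unknown' for a local model folder."""
    root = Path(model_dir)
    # A WordPiece vocabulary is checked first because it is the file needed here.
    if any((root / name).is_file() for name in WORDPIECE_FILES):
        return "wordpiece"
    if any((root / name).is_file() for name in SENTENCEPIECE_FILES):
        return "sentencepiece"
    return "unknown"
```

If this returns anything other than "wordpiece" for a model folder, there is no vocab.txt to pass along with the ONNX file, which matches the situation described for multilingual-e5-small above.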
-
Thanks @RogerBarreto,
I will carry on digging, but I gather that unless 1) works, SK will have to expand support for additional tokenizers.