FEAT: support sparse vector for bge-m3 #2540
bge-m3 is too large for CI, so we can test it manually.
OK
…o main FEAT: add test_convert_ids_to_tokens for convert_ids_to_tokens
About reloading in
…nvert_ids_to_tokens. Added _encode_bgem3 method for bge-m3 model encoding
LGTM
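Adds support for sparse vector generation with the bge-m3 model.
Usage:
model.create_embedding(text, return_sparse=True)
Adds a convert_ids_to_tokens method.
This method converts token IDs into human-readable text.
It returns List[str]; if a List[str] is passed in, the values are returned in the same order.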
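To illustrate how the two features fit together, here is a minimal, self-contained sketch. It does not call the real model: `TOY_VOCAB`, `readable_sparse`, and the standalone `convert_ids_to_tokens` function below are hypothetical stand-ins, and the sparse vector is assumed to be a mapping from token ID (as a string) to weight. The actual return format of `create_embedding(..., return_sparse=True)` may differ.

```python
from typing import Dict, List

# Hypothetical toy vocabulary; the real bge-m3 tokenizer vocabulary is far larger.
TOY_VOCAB: Dict[int, str] = {0: "<s>", 1: "hello", 2: "world", 3: "sparse"}


def convert_ids_to_tokens(token_ids: List[str]) -> List[str]:
    """Stand-in for the new method: map token IDs to tokens, preserving order."""
    return [TOY_VOCAB[int(token_id)] for token_id in token_ids]


def readable_sparse(sparse: Dict[str, float]) -> Dict[str, float]:
    """Re-key an assumed {token_id: weight} sparse vector by token text."""
    token_ids = list(sparse.keys())
    tokens = convert_ids_to_tokens(token_ids)
    # dict preserves insertion order, so tokens and weights stay aligned.
    return dict(zip(tokens, sparse.values()))


if __name__ == "__main__":
    # Assumed shape of a sparse embedding: token ID -> lexical weight.
    example_sparse = {"1": 0.5, "3": 0.25}
    print(readable_sparse(example_sparse))
```

Because the output list follows the input order, the tokens can be zipped back against the weights without any extra bookkeeping.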
Fixes #2527.