[Feature] Support multiple LoRA adapters #695
Comments
Hi, @comeby
The inference results vary widely between the following two strategies:

```python
adapters = {'default': 'xx/lora_test'}
engine = Engine.from_pretrained(model_path, ...)
```
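For context, here is a minimal sketch of the adapters-dict strategy, assuming this is LMDeploy's PyTorch engine (the `Engine.from_pretrained` call suggests so) and the `pipeline` API introduced with the S-LoRA support; the base model path, prompt, and adapter name below are placeholders, not the reporter's actual setup:

```python
from lmdeploy import pipeline, PytorchEngineConfig

# Register adapters by name; values are local paths or hub ids (placeholders here).
backend_config = PytorchEngineConfig(adapters={'default': 'xx/lora_test'})
pipe = pipeline('path/to/base_model', backend_config=backend_config)

# Each request selects an adapter via `adapter_name`.
print(pipe(['Hello, who are you?'], adapter_name='default'))
```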
https://modelscope.cn/models/walker31350430a/test_lora/files
Solved, see #1042.
Motivation
S-LoRA: Serving Thousands of Concurrent LoRA Adapters [paper]
The paper claims that “S-LoRA can improve the throughput by up to 4 times and increase the number of served adapters by several orders of magnitude.”
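For intuition about where the throughput gain comes from, here is a toy sketch of the algebraic idea only, not S-LoRA's actual implementation (the paper relies on unified paging and custom CUDA kernels): the dense base weight is shared by the whole batch, while each request gathers its own small A/B pair, so adding adapters only costs the low-rank matrices. All shapes and values below are illustrative.

```python
import torch

d, r, n_adapters, batch = 16, 4, 3, 5
W = torch.randn(d, d)              # shared base weight
A = torch.randn(n_adapters, r, d)  # per-adapter down-projection
B = torch.randn(n_adapters, d, r)  # per-adapter up-projection

x = torch.randn(batch, d)                    # one token per request, for simplicity
adapter_idx = torch.tensor([0, 2, 1, 0, 2])  # which adapter each request uses

base = x @ W.T  # shared computation across all requests
# Gather each request's adapter and apply the low-rank update B @ (A @ x).
Ax = torch.einsum('brd,bd->br', A[adapter_idx], x)
delta = torch.einsum('bdr,br->bd', B[adapter_idx], Ax)
y = base + delta
print(y.shape)  # torch.Size([5, 16])
```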
Supporting multiple LoRA adapters could be crucial for cost-effective LoRA model serving.
Will you support this feature?
Related resources
https://github.com/S-LoRA/S-LoRA
Additional context
No response