unsloth 4bit models do not load in vLLM - says missing adapter path or name #688
Comments
I am also getting this error, hope a fix comes soon.
Apologies for the late reply! My bro and I relocated to SF, so I just got back to GitHub issues! On vLLM, you mustn't use the …
Since vLLM v0.5.0 has been released, vLLM does support bnb quantization. Would it be possible for models finetuned and quantized with unsloth to be served with vLLM, given the new release?
@nole69 Are you referring to vllm-project/vllm#4776? I think that's only QLoRA adapters, and not full bnb models. You can try exporting the LoRA adapters, then use vLLM I guess.
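Roughly like this, assuming an unsloth QLoRA fine-tune; the adapter directory and the base-model repo below are placeholders, not from this thread:

```python
from unsloth import FastLanguageModel
from vllm import LLM, SamplingParams
from vllm.lora.request import LoRARequest

# Fine-tune as usual with unsloth, then save only the LoRA adapters
# (not the merged 4-bit weights).
model, tokenizer = FastLanguageModel.from_pretrained(
    "unsloth/mistral-7b-instruct-v0.3-bnb-4bit", load_in_4bit=True
)
# ... training happens here ...
model.save_pretrained("my_lora_adapters")       # saves adapters only
tokenizer.save_pretrained("my_lora_adapters")

# Serve the adapters on top of the full-precision base model in vLLM.
llm = LLM(model="mistralai/Mistral-7B-Instruct-v0.3", enable_lora=True)
outputs = llm.generate(
    ["Hello!"],
    SamplingParams(max_tokens=64),
    lora_request=LoRARequest("my_adapter", 1, "my_lora_adapters"),
)
```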
@danielhanchen Indeed, I think full bnb models will be supported after vllm-project/vllm#5753 is merged.
I also ran into this problem, and I used QLoRA. How do I fix it?
@fengyunflya Sorry, vLLM currently doesn't load bitsandbytes models. I'll try to add some code to export directly to vLLM.
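In the meantime, merging back to 16-bit should work as a workaround; a minimal sketch, assuming a fine-tuned unsloth `model` and `tokenizer` (the output directory name is illustrative):

```python
# Merge the QLoRA adapters into dequantized 16-bit weights on disk.
model.save_pretrained_merged(
    "merged_16bit_model", tokenizer, save_method="merged_16bit"
)

# vLLM can then load the merged checkpoint without bitsandbytes.
from vllm import LLM
llm = LLM(model="merged_16bit_model", dtype="half")
```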
Would also love to have this feature, currently also having the same problem with 4bnb models not being able to be loaded in vllm |
Now, bnb models work on vLLM using …
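For anyone landing here later, loading looks roughly like this on vLLM versions with bitsandbytes support; exact flags vary by version, and newer releases may infer `load_format` on their own:

```python
from vllm import LLM

# Load a bitsandbytes 4-bit checkpoint directly, no adapter path needed.
llm = LLM(
    model="unsloth/mistral-7b-instruct-v0.3-bnb-4bit",
    quantization="bitsandbytes",
    load_format="bitsandbytes",
    dtype="half",
)
```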
When I try to load an unsloth 4-bit model with

```python
llm = LLM("unsloth/mistral-7b-instruct-v0.3-bnb-4bit", dtype="half")
```

I get the error

```
Cannot find any of ['adapter_name_or_path'] in the model's quantization config.
```

This is true for all Llama 3 and Gemma models as well. As far as I know, there are no LoRA adapters attached to these models. Please let me know how to proceed in loading them.
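For anyone hitting this before bnb support landed in vLLM: one interim option, when a 16-bit variant of the model exists on the Hub, is to load that repo instead of the `-bnb-4bit` one (the repo name here is illustrative; check that a 16-bit upload exists for your model):

```python
from vllm import LLM

# Load the unquantized 16-bit upload instead of the bnb 4-bit one.
llm = LLM("unsloth/mistral-7b-instruct-v0.3", dtype="half")
```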