
unsloth 4bit models do not load in vLLM - says missing adapter path or name #688

Open
jonberliner opened this issue Jun 24, 2024 · 9 comments
Labels: currently fixing (Am fixing now!)

Comments

@jonberliner

When I try to load an unsloth 4bit model with
llm = LLM("unsloth/mistral-7b-instruct-v0.3-bnb-4bit", dtype="half"),
I get the error
Cannot find any of ['adapter_name_or_path'] in the model's quantization config.

This is true for all Llama 3 and Gemma models as well. As far as I know, there are no LoRA adapters attached to these models. Please let me know how to proceed with loading them.
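For reference, a minimal repro sketch of that call (it assumes only that vLLM is installed; the error text is copied from the report above):

```python
# Minimal repro sketch: loading the pre-quantized bnb-4bit repo directly in vLLM.
from vllm import LLM

llm = LLM("unsloth/mistral-7b-instruct-v0.3-bnb-4bit", dtype="half")
# Fails with: Cannot find any of ['adapter_name_or_path'] in the model's quantization config.
```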

@jonberliner changed the title from "unsloth base models do not load - says missing adapter path or name" to "unsloth base models do not load in vLLM - says missing adapter path or name" on Jun 24, 2024
@jonberliner changed the title from "unsloth base models do not load in vLLM - says missing adapter path or name" to "unsloth 4bit models do not load in vLLM - says missing adapter path or name" on Jun 24, 2024
@hruday-markonda

I am also getting this error, hope a fix comes soon.

@danielhanchen
Contributor

Apologies for the late reply! My bro and I relocated to SF, so I just got back to GitHub issues!

On vLLM, you can't use the bnb-4bit variants - you need to use model.save_pretrained_merged and save to 16-bit for inference, i.e. only full 16-bit models work with vLLM.
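For clarity, a rough sketch of that workflow; it assumes the save_pretrained_merged API from the Unsloth docs, and "merged_16bit_model" is just a placeholder output directory:

```python
# Sketch only: merge fine-tuned weights to full 16-bit with Unsloth, then serve them with vLLM.
from unsloth import FastLanguageModel
from vllm import LLM

# Load (and fine-tune) as usual with Unsloth.
model, tokenizer = FastLanguageModel.from_pretrained(
    "unsloth/mistral-7b-instruct-v0.3-bnb-4bit",
    load_in_4bit=True,
)
# ... training steps omitted ...

# Save merged full 16-bit weights ("merged_16bit_model" is a placeholder path).
model.save_pretrained_merged("merged_16bit_model", tokenizer, save_method="merged_16bit")

# vLLM then loads the merged 16-bit folder, not the bnb-4bit repo.
llm = LLM("merged_16bit_model", dtype="half")
```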

@nole69

nole69 commented Jul 1, 2024

> Apologies for the late reply! My bro and I relocated to SF, so I just got back to GitHub issues!
>
> On vLLM, you can't use the bnb-4bit variants - you need to use model.save_pretrained_merged and save to 16-bit for inference, i.e. only full 16-bit models work with vLLM.

Since vLLM v0.5.0 has been released, vLLM now supports bnb quantization. Would it be possible for models fine-tuned and quantized with Unsloth to be served with vLLM given the new release?

@danielhanchen
Contributor

@nole69 Are you referring to vllm-project/vllm#4776? I think that covers only QLoRA adapters, not full bnb models. You could try exporting the LoRA adapters and then using vLLM, I guess.
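A rough sketch of that approach, assuming vLLM's LoRA support (enable_lora / LoRARequest); the base-model name, adapter path, and adapter name below are placeholders:

```python
# Sketch only: save just the LoRA adapter, then attach it to a 16-bit base model in vLLM.
from vllm import LLM, SamplingParams
from vllm.lora.request import LoRARequest

# `model` and `tokenizer` are assumed to come from an Unsloth fine-tuning run;
# save only the adapter weights ("lora_adapter" is a placeholder path).
model.save_pretrained("lora_adapter")
tokenizer.save_pretrained("lora_adapter")

# Serve the unquantized 16-bit base model with LoRA enabled.
llm = LLM("mistralai/Mistral-7B-Instruct-v0.3", dtype="half", enable_lora=True)
outputs = llm.generate(
    ["Hello!"],
    SamplingParams(max_tokens=64),
    lora_request=LoRARequest("my_adapter", 1, "lora_adapter"),
)
```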

@odulcy-mindee

@danielhanchen Indeed, I think full bnb models will be supported after vllm-project/vllm#5753 is merged

@fengyunflya

I also ran into this problem, and I used QLoRA. How do I fix it?

@danielhanchen
Contributor

@fengyunflya Sorry, vLLM currently doesn't load bitsandbytes models - I'll try to add some code to export directly for vLLM.

@danielhanchen added the "currently fixing" (Am fixing now!) label on Jul 26, 2024
@YorickdeJong

Would also love to have this feature; I'm currently hitting the same problem with bnb-4bit models not being able to load in vLLM.

@odulcy-mindee

bnb models now work on vLLM's main branch when enforce_eager=True, but enforce_eager=False is not supported yet.
See vllm-project/vllm#7294
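A sketch of what that looks like, assuming vLLM's bitsandbytes support on main (the exact flags may vary between versions):

```python
# Sketch only: load a pre-quantized bitsandbytes checkpoint with eager mode forced on.
from vllm import LLM

llm = LLM(
    "unsloth/mistral-7b-instruct-v0.3-bnb-4bit",
    quantization="bitsandbytes",
    load_format="bitsandbytes",
    enforce_eager=True,  # per the linked issue, CUDA graph mode (enforce_eager=False) is not supported yet
)
```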
