Finally managed to run Qwen models successfully with Outlines #514
aalyousfi started this conversation in Show and tell
I've used many models with Outlines without problems. However, I've been unable to use Qwen successfully. Running this simple example:
gives this error:
It was frustrating because I could use Qwen models with the `transformers` library directly (i.e., without Outlines) with no issues. Anyway, after some research, I managed to get past this error by running:
But then I got another error:
I managed to solve this error by adding `"fp32": True` to `model_kwargs` in the code above. After that, the model runs without issues. I just wanted to share this here for anyone facing the same issue, and it may also be useful for improving Qwen support in Outlines.

Note: This works for Transformers. For AWQ, you also need to add `"bf16": False` to `model_kwargs`, since it is `True` by default.

However, it becomes slower since we are using 32 bits instead of 16, and I get a warning saying:
That made me wonder: is there a way to use models with `bf16=True` in Outlines? I think the performance boost would be significant.
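
For anyone who wants to reuse the workaround, here is a minimal sketch that collects the two flags mentioned above into a single dictionary. The helper name `qwen_model_kwargs` and its `awq` parameter are hypothetical and only for illustration; the keys `fp32` and `bf16` are the ones from this post, and the sketch assumes they are forwarded unchanged to the Qwen loading code through `model_kwargs`.

```python
from typing import Any, Dict


def qwen_model_kwargs(awq: bool = False) -> Dict[str, Any]:
    """Build the precision flags described in this post (hypothetical helper).

    Assumption: the returned keys are passed through `model_kwargs`
    to Qwen's model-loading code without modification.
    """
    # Force 32-bit weights, which resolved the second error in the post.
    kwargs: Dict[str, Any] = {"fp32": True}
    if awq:
        # The post reports that bf16 is True by default and must be
        # switched off explicitly for AWQ models.
        kwargs["bf16"] = False
    return kwargs


# Flags for a regular Transformers checkpoint and for an AWQ checkpoint.
print(qwen_model_kwargs())
print(qwen_model_kwargs(awq=True))
```

The resulting dictionary would go into the `model_kwargs` argument of the loading call, merged with any other keys that call already needs.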