
VLLM batch inference example is broken #536

Closed
azgo14 opened this issue Jan 14, 2024 · 5 comments
Labels
bug · vLLM (Things involving vLLM support)

Comments


azgo14 commented Jan 14, 2024

Describe the issue as clearly as possible:

The batch inference vLLM examples don't run; they fail with an AttributeError.

In vllm 0.2.7 (and also 0.2.6) there seem to be two interfaces: one is vllm.LLM and the other is AsyncLLMEngine. Only AsyncLLMEngine has a tokenizer accessor, which RegexLogitsProcessor and CFGLogitsProcessor both assume is present.

To get batch inference to work, I modified both of the above processors with something like this:

-        fsm = RegexFSM(regex_string, llm.tokenizer)
+        fsm = RegexFSM(regex_string, _adapt_tokenizer(llm.get_tokenizer()))
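To make the mismatch concrete, here is a small illustrative snippet (the model name is a placeholder, and the behavior is as reported above for vllm 0.2.6/0.2.7):

```python
import vllm

llm = vllm.LLM(model="facebook/opt-125m")  # placeholder model

# The batch-inference interface only exposes the tokenizer through an
# accessor, which is what the modified processors above rely on:
print(hasattr(llm, "tokenizer"))  # False in vllm 0.2.6/0.2.7
tokenizer = llm.get_tokenizer()   # works: returns the underlying tokenizer

# The unmodified RegexLogitsProcessor and CFGLogitsProcessor read
# `llm.tokenizer` directly, which raises the AttributeError shown below.
```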

Steps/code to reproduce the bug:

Run `python examples/vllm_integration.py`

Expected result:

Normal execution, not an AttributeError because the tokenizer accessor doesn't exist.

Error message:

AttributeError: 'LLM' object has no attribute 'tokenizer'. Did you mean: 'get_tokenizer'?

Outlines/Python version information:

Head version of outlines; Python 3.10.

Context for the issue:

No response

rlouf (Member) commented Jan 16, 2024

We reverted a recent change that modified the vLLM integration. Could you retry this on the current version of main?

rlouf added the vLLM label on Jan 16, 2024
rlouf closed this as completed on Jan 18, 2024

sethkimmel3 commented Feb 7, 2024

Hey @rlouf - this still appears to be broken. I'm getting an `AttributeError: 'LLM' object has no attribute 'tokenizer'` error when using the JSONLogitsProcessor, with vllm==0.3.0 and outlines==0.0.27.
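A minimal repro along those lines might look like the following sketch; the model and schema are placeholders, and the JSONLogitsProcessor signature is assumed from outlines/serve/vllm.py at that release:

```python
import vllm
from outlines.serve.vllm import JSONLogitsProcessor

llm = vllm.LLM(model="facebook/opt-125m")  # placeholder model
schema = {"type": "object", "properties": {"name": {"type": "string"}}}

# Fails with: AttributeError: 'LLM' object has no attribute 'tokenizer'
logits_processor = JSONLogitsProcessor(schema, llm)
```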

Have you merged the above change into the most recent release?


sethkimmel3 commented Feb 8, 2024

I got this to work by explicitly calling the adapt_tokenizer method found here. However, I'm pretty confused about why this is necessary, given that JSONLogitsProcessor seems to instantiate RegexLogitsProcessor.

I also needed to include the _patched_apply_logits_processors patch, as shown in this issue comment.
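Building on the repro sketch above, the workaround might look roughly like this; `adapt_tokenizer` stands for the helper referenced above, and both its import path and the trick of attaching the result to the LLM object are assumptions rather than a documented API:

```python
import vllm
from outlines.serve.vllm import JSONLogitsProcessor, adapt_tokenizer  # import path assumed

llm = vllm.LLM(model="facebook/opt-125m")  # placeholder model

# Expose an adapted tokenizer under the attribute name the processor looks
# up, so the `llm.tokenizer` access no longer raises AttributeError.
llm.tokenizer = adapt_tokenizer(llm.get_tokenizer())

schema = {"type": "object", "properties": {"name": {"type": "string"}}}
logits_processor = JSONLogitsProcessor(schema, llm)

# _patched_apply_logits_processors from the same module still has to be
# applied to vLLM's sampler, as described in the linked issue comment.
```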

lapp0 (Contributor) commented Feb 8, 2024

@sethkimmel3 could you please create an issue and provide a reproduction script? I think the docstring below is inaccurate; it actually seems to take an _AsyncLLMEngine.

https://github.com/outlines-dev/outlines/blob/7fae436345e621a955e1e6ea610f74cf59f9466f/outlines/serve/vllm.py#L42-L53

@sethkimmel3

Done @lapp0 - #624
