VLLM batch inference example is broken #536
Comments
We reverted a recent change that modified the vLLM integration. Could you retry this on the current version?
Hey @rlouf - this still appears to be broken; I'm getting the same error. Have you merged the above change into the most recent release?
I got this to work by explicitly calling the […]. I also needed to include the […].
@sethkimmel3 could you please create an issue and provide a reproduction script? I think the docstring below is inaccurate; it actually seems to take an […].
Describe the issue as clearly as possible:
The batch inference vLLM examples don't run; they raise an AttributeError.
In vllm 0.2.7 (and also 0.2.6) there seem to be two interfaces: one is vllm.LLM and the other is AsyncLLMEngine. Only AsyncLLMEngine has a tokenizer accessor, which RegexLogitsProcessor and CFGLogitsProcessor assume. To get batch inference to work I modify both of the above processors with something like this:
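(The original snippet was not preserved in this report. A minimal sketch of the kind of change described above, assuming the processors build their FSM from a Hugging Face tokenizer, might look like the following; the helper name and the llm.llm_engine.tokenizer lookup are assumptions, not the reporter's exact code.)

```python
# Sketch of a modification to the RegexLogitsProcessor / CFGLogitsProcessor
# so they work with both vLLM interfaces. The helper name and the
# `llm.llm_engine.tokenizer` lookup are assumptions, not the reporter's code.
def _get_tokenizer(llm):
    """Return the tokenizer from either an AsyncLLMEngine or a vllm.LLM."""
    if hasattr(llm, "tokenizer"):
        # AsyncLLMEngine exposes the tokenizer directly.
        return llm.tokenizer
    if hasattr(llm, "llm_engine"):
        # vllm.LLM wraps an LLMEngine; the tokenizer lives on the inner engine.
        return llm.llm_engine.tokenizer
    raise AttributeError("No tokenizer found on the provided vLLM object")


# Inside each processor's __init__, the direct `llm.tokenizer` access would
# then be replaced with the helper above, e.g.:
#     tokenizer = self.adapt_tokenizer(_get_tokenizer(llm))
```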
Steps/code to reproduce the bug:
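(The reproduction script was not included in the extracted report. A minimal sketch of the kind of call that triggers the error, based on the description above, might be the following; the import path, model name, and regex are assumptions for illustration.)

```python
# Reconstructed reproduction sketch, not the reporter's original script.
import vllm
from outlines.serve.vllm import RegexLogitsProcessor

llm = vllm.LLM(model="mistralai/Mistral-7B-v0.1")

# Constructing the processor against vllm.LLM fails, because the processor
# reads `llm.tokenizer`, an accessor that only AsyncLLMEngine provides.
logits_processor = RegexLogitsProcessor(r"[0-9]+", llm)  # AttributeError
```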
Expected result:
Normal execution, rather than an AttributeError because the accessor doesn't exist.
Error message:
Outlines/Python version information:
Context for the issue:
No response