Save the vllm tokenizer adapted state
mory91 committed Jan 25, 2024
1 parent e16d986 commit 9fea504
Showing 2 changed files with 12 additions and 3 deletions.
2 changes: 1 addition & 1 deletion docs/reference/vllm.md
@@ -50,7 +50,7 @@ curl http://127.0.0.1:8000/generate \
}'
```

-To generate a string that matches the grammar `<grammar>`:
+To generate a string that matches a given grammar `<grammar>`:

```bash
curl http://127.0.0.1:8000/generate \
13 changes: 11 additions & 2 deletions outlines/serve/vllm.py
@@ -44,10 +44,16 @@ def _adapt_tokenizer(tokenizer):
     """Adapt vLLM's tokenizer to use to compile the FSM.
     The API of Outlines tokenizers is slightly different to that of
-    `transformers`. In addition we need to handle the missing spaces to
-    Llama's tokenizer to be able to compile FSMs for this model.
+    `transformers`. The decoder of outlines, returns a list whereas
+    the decode of vLLM returns an str. To sync the vLLM decoder with
+    outlines internal api, the decoder should be adapted. In addition
+    we need to handle the missing spaces to Llama's tokenizer to be
+    able to compile FSMs for this model.
     """
+    if getattr(tokenizer, "_outlines_adapted", False):
+        return tokenizer
+
     tokenizer.vocabulary = tokenizer.get_vocab()
     tokenizer.special_tokens = set(tokenizer.all_special_tokens)

@@ -65,13 +71,16 @@ def convert_token_to_string(token: str) -> str:
     def change_decoder(
         decoder: Callable[[List[int]], str]
     ) -> Callable[[List[int]], List[str]]:
+        """Sync vLLM's decoder with the outlines expectations by returning list"""
+
         def new_decoder(inp_tokens: List[int]) -> List[str]:
             return [decoder(inp_tokens)]

         return new_decoder

     tokenizer.convert_token_to_string = convert_token_to_string
     tokenizer.decode = change_decoder(tokenizer.decode)
+    setattr(tokenizer, "_outlines_adapted", True)

     return tokenizer
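
The commit message does not say why the adapted state is saved. A plausible reading is that `_adapt_tokenizer` can be called more than once on the same tokenizer object, in which case `decode` would be wrapped again on each call. The sketch below illustrates that reading only. `StubTokenizer`, `adapt_unguarded` and `adapt_guarded` are hypothetical names invented for this example and are not part of vLLM or Outlines; only `change_decoder` and the `_outlines_adapted` flag come from the diff.

```python
from typing import Callable, List


class StubTokenizer:
    """Hypothetical stand-in for vLLM's tokenizer; not the real class."""

    def decode(self, token_ids: List[int]) -> str:
        # Toy decoding: 0 -> "a", 1 -> "b", and so on.
        return "".join(chr(ord("a") + i) for i in token_ids)


def change_decoder(
    decoder: Callable[[List[int]], str]
) -> Callable[[List[int]], List[str]]:
    """Wrap a str-returning decoder so it returns a one-element list."""

    def new_decoder(inp_tokens: List[int]) -> List[str]:
        return [decoder(inp_tokens)]

    return new_decoder


def adapt_unguarded(tokenizer):
    """Behaviour without the flag: `decode` is wrapped on every call."""
    tokenizer.decode = change_decoder(tokenizer.decode)
    return tokenizer


def adapt_guarded(tokenizer):
    """Behaviour with the flag: a second call returns early."""
    if getattr(tokenizer, "_outlines_adapted", False):
        return tokenizer
    tokenizer.decode = change_decoder(tokenizer.decode)
    setattr(tokenizer, "_outlines_adapted", True)
    return tokenizer


# Adapt each tokenizer twice to simulate repeated calls.
unguarded = adapt_unguarded(adapt_unguarded(StubTokenizer()))
guarded = adapt_guarded(adapt_guarded(StubTokenizer()))

print(unguarded.decode([0, 1]))  # nested list: [['ab']]
print(guarded.decode([0, 1]))    # flat list: ['ab']
```

Storing the flag on the tokenizer object itself means the check needs no external registry, at the cost of adding a private attribute to an object owned by another library.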
