Now I use outlines+vllm and it takes 10 seconds to generate a JSON. This speed is not feasible in my current business. Is there any accelerated solution, such as a paper, plan or PR? #1161

lwdnxu · 2024-09-18T01:38:40Z

I'm currently using outlines + vLLM, and generating a single JSON object takes about 10 seconds. That is too slow for my production use case. Is there any way to speed this up, such as a paper, a planned improvement, or a PR?

cpfiffer (Collaborator) · 2024-09-18T05:18:50Z

Speed improvements are always being worked on, but it may also be worth discussing your general setup. What models and schemas do you tend to work with? There might be some places we can suggest performance improvements.

lwdnxu (Author) · 2024-09-18T06:25:15Z

Model: qwen2-7b
GPU: 4090
I want to generate JSON like this:
{
  "dataSource": "1002",
  "column": "200-279",
  "value": [
    "360"
  ],
  "mark": "16",
  "unit": null
}
Number of tokens per request: 4166
With vLLM + outlines, a request takes 10+ seconds.
With vLLM alone, a request takes about 1.x seconds.

Is there an optimized solution? And what is the best latency you can achieve in this case?

lapp0 Sep 18, 2024

@lwdnxu Each new schema requires an index to be compiled, which has some overhead. Once generation starts, overhead is ~1ms/token. This index only needs to be compiled once per unique schema.

Are you using a new schema each time, or are you seeing slowness re-using the same schema?
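
To illustrate the "compile once per unique schema" point, here is a minimal sketch of caching a compiled index by schema. It is not the outlines or vLLM API: `compile_index`, `get_index`, and the sleep-based cost are hypothetical stand-ins, and whether your serving stack already caches this for you depends on its version and configuration. The sketch only shows that identical schemas (even with different key order) can share one compilation.

```python
import json
import time
from functools import lru_cache

COMPILE_CALLS = 0  # counts how often the expensive step actually runs


def compile_index(schema_json: str) -> dict:
    """Hypothetical stand-in for the expensive per-schema compilation step."""
    global COMPILE_CALLS
    COMPILE_CALLS += 1
    time.sleep(0.05)  # simulated cost; the real cost depends on the schema
    return {"schema": json.loads(schema_json)}


@lru_cache(maxsize=128)
def _cached_index(schema_json: str) -> dict:
    # lru_cache needs a hashable key, so the schema is passed as a string.
    return compile_index(schema_json)


def get_index(schema: dict) -> dict:
    # Canonicalize so logically identical schemas map to one cache entry.
    key = json.dumps(schema, sort_keys=True, separators=(",", ":"))
    return _cached_index(key)


schema_a = {"type": "object", "properties": {"mark": {"type": "string"}}}
# Same schema as schema_a, written with a different key order.
schema_b = {"properties": {"mark": {"type": "string"}}, "type": "object"}

first = get_index(schema_a)   # pays the compile cost
second = get_index(schema_b)  # served from the cache
```

If every request builds a fresh schema (for example, by embedding per-request values into it), each one is a cache miss and pays the compile cost again, which would match the pattern of slowness on every call.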
