feat: support guided decoding for vllm async engine #2391
Conversation
Which version is required?
The latest version after 0.6.2; we are waiting for vllm to release a new version.
Force-pushed from 2968700 to cd0812a
vllm has released v0.6.3; is this PR ready to work?
I will test it.
Force-pushed from 4d9e044 to 852c86c
Works on my machine now.
Can you confirm there is no exception if vllm is an old version?
Force-pushed from 823887f to df849b1
It now works properly even if the vllm version is < 0.6.3.
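For reference, a minimal sketch of how such a version gate can look (illustrative names only, not the code from this PR; it assumes the `packaging` library and an installed vllm):

```python
from importlib.metadata import version as pkg_version

from packaging.version import Version

# Guided decoding support landed in vllm 0.6.3; gate on the installed
# version so older installations reject the option cleanly instead of
# failing inside the engine.
VLLM_VERSION = Version(pkg_version("vllm"))
GUIDED_DECODING_SUPPORTED = VLLM_VERSION >= Version("0.6.3")


def build_generate_kwargs(guided_json=None, **kwargs):
    # `build_generate_kwargs` and `guided_json` are hypothetical names
    # used for illustration, not identifiers from this PR.
    if guided_json is not None:
        if not GUIDED_DECODING_SUPPORTED:
            raise ValueError(
                f"Guided decoding requires vllm >= 0.6.3, found {VLLM_VERSION}"
            )
        kwargs["guided_json"] = guided_json
    return kwargs
```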
Force-pushed from b8025ea to eb816c1
Force-pushed from d1d41bf to 9d13391
Force-pushed from 9d13391 to 60e3e3e
This feature has been tested on my machine and appears to be functioning properly. @qinxuye
Support guided decoding for the vllm async engine.
Waiting for a vllm release; a version bump is needed.
#1562
vllm-project/vllm#8252
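For anyone who wants to try the feature, a hedged usage sketch through the OpenAI-compatible API (this assumes the server forwards vLLM-style guided decoding fields such as `guided_json`; the endpoint, API key, and model UID below are placeholders):

```python
from openai import OpenAI

# Placeholder endpoint and credentials; Xinference listens on 9997 by default.
client = OpenAI(base_url="http://localhost:9997/v1", api_key="not-needed")

# JSON schema the model output must conform to.
schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "integer"},
    },
    "required": ["name", "age"],
}

resp = client.chat.completions.create(
    model="my-model-uid",  # placeholder model UID
    messages=[{"role": "user", "content": "Describe a person as JSON."}],
    # Assumed pass-through of vLLM's guided decoding option.
    extra_body={"guided_json": schema},
)
print(resp.choices[0].message.content)
```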