[Bug]: speculative decoding dies: IndexError: index 0 is out of bounds for dimension 0 with size 0 #7047
Comments
Maybe you can change your speculative model or set the …
Is anyone fixing this bug?
I'm happy to try other options. It was working well for someone else, but not for me on the phi-3-mini-128k model; it failed instantly. I'll probably wait until this bug is fixed before trying again. The hope is that for structured output others are getting quite good speed-ups, i.e. for guided_json and JSON output, about a 5x improvement on a 7b model. Sounds great, but it just crashes for me.
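For context, a guided_json request against vLLM's OpenAI-compatible server looks roughly like the sketch below. The schema, prompt, and model name are illustrative assumptions (not taken from this thread); with the OpenAI Python client the `guided_json` field would be passed via `extra_body`, and actually sending it requires a running vLLM server.

```python
import json

# Hypothetical JSON schema for the structured output; any valid schema works.
schema = {
    "type": "object",
    "properties": {"name": {"type": "string"}, "age": {"type": "integer"}},
    "required": ["name", "age"],
}

# Request body for the /v1/chat/completions endpoint. `guided_json` is a
# vLLM-specific extra field alongside the standard OpenAI parameters.
payload = {
    "model": "microsoft/Phi-3-mini-128k-instruct",
    "messages": [{"role": "user", "content": "Describe a person as JSON."}],
    "guided_json": schema,
}

print(json.dumps(payload, indent=2))
```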
Did you try adding …
FYI, you can build from the source code of the main branch. I guess the container you are using was built with vLLM version v0.5.3 or v0.5.3.post1. #6698 has fixed this bug. Alternatively, you can wait for the release of v0.5.4, which should not crash this way again.
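Since the fix landed between v0.5.3.post1 and v0.5.4, a quick local sanity check is to compare the installed version string against 0.5.4. The helper below is a minimal sketch (the `has_fix` name and the `.postN` handling are my own, not part of vLLM); in practice you would feed it `importlib.metadata.version("vllm")`.

```python
def version_tuple(v: str) -> tuple:
    """Turn '0.5.3.post1' into (0, 5, 3, 1) for ordered comparison."""
    parts = []
    for piece in v.split("."):
        if piece.startswith("post"):
            piece = piece[4:]
        parts.append(int(piece) if piece.isdigit() else 0)
    return tuple(parts)

def has_fix(installed: str, fixed: str = "0.5.4") -> bool:
    """True if `installed` is at or past the release carrying the fix."""
    a, b = version_tuple(installed), version_tuple(fixed)
    n = max(len(a), len(b))
    # Pad both tuples so e.g. (0, 5, 4) and (0, 5, 3, 1) compare cleanly.
    return a + (0,) * (n - len(a)) >= b + (0,) * (n - len(b))

print(has_fix("0.5.3.post1"))  # False: predates the fix
print(has_fix("0.5.4"))        # True
```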
0.5.4 seems to fix the issue. |
Your current environment
🐛 Describe the bug
With the very first message to the model, "Who are you?", I got "I" back and then it died.