[Usage]: Persistent Errors with vllm serve on Neuron Device: Model architectures ['LlamaForCausalLM'] failed to be inspected. #10932
Comments
Can you show the full logs? Not just the final stack trace.
@DarkLight1337 here it is, sorry:
This looks like a problem with custom ops. @youkaichao might be able to help.
I think the Neuron torch version is too low.
Our CI works, though; see https://buildkite.com/vllm/ci/builds/10346#01939f7d-ef3c-42e2-bfce-c759f713b3d8.
@youkaichao I think my Neuron torch version is the latest:
Yup, Neuron still requires torch==2.1, I think. I was able to reproduce this locally; I'm not sure how it passes on CI. Here is a pull request with a fix:
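As an aside (my own note, not from the thread): a quick way to confirm which torch the environment actually resolves, since the Neuron SDK pins an older torch than current vLLM expects. This assumes torch and torch-neuronx are installed in the active environment:

```bash
# Print the torch version that the Neuron environment actually resolves.
python -c "import torch; print(torch.__version__)"

# Show the installed AWS Neuron PyTorch plugin version, if present.
pip show torch-neuronx
```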
Your current environment
Hello vLLM Development Team,
I am encountering persistent issues when trying to run the `vllm serve` command for the `meta-llama/Llama-3.2-1B` model on an AWS EC2 inf2 instance with the Neuron AMI. Despite following all the recommended installation and upgrade steps, and adjusting the numpy versions as per the guidelines, the issue persists. I already referred to the issues I could find, such as:
#9624
#9713
Here is how I installed vLLM, following the instructions in the installation guide:
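For reference, these are roughly the steps I followed (a sketch reconstructed from the Neuron installation guide; the exact commands and versions on my machine may differ):

```bash
# Sketch of the documented from-source build for Neuron devices
# (assumes an inf2 instance with the Neuron SDK / drivers already set up).
git clone https://github.com/vllm-project/vllm.git
cd vllm
pip install -U -r requirements-neuron.txt
VLLM_TARGET_DEVICE="neuron" pip install .
```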
I have already tried reinstalling and upgrading vLLM per the instructions above many times, and also tried different numpy versions, but I still cannot get past the problem when I run:
vllm serve meta-llama/Llama-3.2-1B --device neuron --tensor-parallel-size 2 --block-size 8 --max-model-len 4096 --max-num-seqs 32
It consistently fails with the following error:
ValueError: Model architectures ['LlamaForCausalLM'] failed to be inspected. Please check the logs for more details.
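One way to surface the underlying import error that this message hides (a suggestion on my side, assuming the failure happens while vLLM imports the Llama implementation) is to import the model module directly:

```bash
# Import the Llama model class directly so the real ImportError / traceback
# is printed instead of the generic "failed to be inspected" message.
# (vllm.model_executor.models.llama is the usual module path; adjust if your
# vLLM version lays the models out differently.)
python -c "from vllm.model_executor.models.llama import LlamaForCausalLM"
```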
Here is my environment:
Here is the error log:
I am reaching out to ask for your expert advice on how to proceed or if there are any additional steps you could suggest to help resolve this issue. Any assistance you can provide would be greatly appreciated.
How would you like to use vllm
No response
Before submitting a new issue...