Closed
Description
Your current environment
The output of `python collect_env.py`
Your output of `python collect_env.py` here
Model Input Dumps
No response
🐛 Describe the bug
Following the instructions in #8700 (comment), the reward model https://huggingface.co/nvidia/Llama-3.1-Nemotron-70B-Reward-HF is a LlamaForCausalLM model, so I serve it with vLLM and add the parameter `--task embedding`.
When I send a request, it encounters an error:
INFO: "POST /v1/embeddings HTTP/1.1" 500 Internal Server Error
ERROR 11-19 17:55:16 engine.py:135] TypeError("object of type 'NoneType' has no len()")
The server then terminated.
The shell command I used:
curl http://host:port/v1/embeddings \
-H "Content-Type: application/json" \
-d '{
"model": "Llama-3.1-Nemotron-70B-Reward-HF",
"input": "Your text string goes here"
}'
Using Python code equivalent to https://github.com/vllm-project/vllm/blob/main/examples/openai_embedding_client.py produces the same error.
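For reference, here is a minimal standard-library sketch of the same request as the curl command above, in case it helps with reproduction. It is not the linked example script. The helper names `build_embedding_request` and `send_request` are made up for illustration, and the base URL in the usage comment is a placeholder, just like `host:port` in the curl command.

```python
import json
import urllib.request


def build_embedding_request(base_url, model, text):
    """Build (but do not send) a POST request to the /v1/embeddings route.

    `base_url` is a placeholder such as "http://host:port"; substitute the
    address of the running server.
    """
    payload = {"model": model, "input": text}
    return urllib.request.Request(
        url=f"{base_url}/v1/embeddings",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_request(request, timeout=30.0):
    """Send the request and return the decoded JSON body.

    urllib raises urllib.error.HTTPError for a 500 response, which is what
    the server returns in this report.
    """
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Usage (needs a running server; the address below is a placeholder):
# request = build_embedding_request(
#     "http://host:port",
#     "Llama-3.1-Nemotron-70B-Reward-HF",
#     "Your text string goes here",
# )
# print(send_request(request))
```

The request is built separately from sending it so the URL, headers, and JSON body can be inspected without contacting the server.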