[Usage]: Sampling several sequences from the OpenAI compatible server #10852

Ignoramus0817 opened this issue Dec 3, 2024 · 4 comments
Ignoramus0817 commented Dec 3, 2024

Your current environment

I got an error when running collect_env.py:

ImportError: cannot import name '__version_tuple__' from 'vllm'

Anyway, I'm using vLLM 0.5.3.post1.

How would you like to use vllm

I want to sample n independent samples in one call of the chat completion API from LLaMA-3-70B-Instruct (served with the OpenAI compatible server).

I added the sampling parameter n to the API call, but got n identical responses whether or not a seed was set. However, if I manually call the API several times, each time with a different seed, I do get different outputs. Is this the expected behavior? If so, what does the sampling parameter n actually do? The request I'm sending looks roughly like the sketch below.
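
For reference, a rough sketch of the call (the server URL and model name are placeholders for my actual setup):

```python
# Rough sketch of the request described above, sent to a vLLM
# OpenAI-compatible server; URL and model name are placeholders.
import requests

payload = {
    "model": "meta-llama/Meta-Llama-3-70B-Instruct",
    "messages": [{"role": "user", "content": "Name a random fruit."}],
    "n": 4,           # ask for four completion choices in one call
    "temperature": 1.0,
    "seed": 1234,     # optional; I see the same behavior with or without it
}
resp = requests.post("http://localhost:8000/v1/chat/completions", json=payload)
for choice in resp.json()["choices"]:
    print(choice["index"], choice["message"]["content"])
```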

Before submitting a new issue...

  • Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.
Ignoramus0817 added the usage (How to use vllm) label Dec 3, 2024
jikunshang (Contributor) commented:

#10503 added support for this model last week. Please install the latest vLLM and try again.

Ignoramus0817 (Author) commented Dec 4, 2024

> #10503 added support for this model last week. Please install the latest vLLM and try again.

This PR is about the OLMo model, which is irrelevant to this issue. I guess you replied to the wrong one?

jikunshang (Contributor) commented:

> > #10503 added support for this model last week. Please install the latest vLLM and try again.
>
> This PR is about the OLMo model, which is irrelevant to this issue. I guess you replied to the wrong one?

Sorry, please ignore this; I replied to the wrong thread.

jikunshang (Contributor) commented:

n means "how many chat completion choices to generate for each input message"; see https://platform.openai.com/docs/api-reference/chat and https://github.com/vllm-project/vllm/blob/main/vllm/sampling_params.py#L99.

How do you sample the n independent samples on your side? Can you try with examples/openai_completion_client.py or examples/openai_chat_completion_client.py (you need to add the n parameter)? I tested both; when n is set to 2, the two outputs are different.
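
For concreteness, a minimal sketch of that test, patterned after examples/openai_chat_completion_client.py with the n parameter added (it assumes the server is running locally on the example's default port):

```python
# Sketch of examples/openai_chat_completion_client.py with `n` added.
# Assumes a vLLM OpenAI-compatible server at localhost:8000.
from openai import OpenAI

client = OpenAI(
    api_key="EMPTY",  # the vLLM server does not check the key by default
    base_url="http://localhost:8000/v1",
)

# Use whichever model the server is currently serving.
model = client.models.list().data[0].id

chat_completion = client.chat.completions.create(
    model=model,
    messages=[{"role": "user", "content": "Tell me a short story."}],
    n=2,  # request two completion choices for the same prompt
)

for choice in chat_completion.choices:
    print(f"choice {choice.index}: {choice.message.content}")
```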
