-
-
Notifications
You must be signed in to change notification settings - Fork 5.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Frontend] re-enable multi-modality input in the new beam search implementation #9427
[Frontend] re-enable multi-modality input in the new beam search implementation #9427
Conversation
👋 Hi! Thank you for contributing to the vLLM project. Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging. To run CI, PR reviewers can do one of these:
🚀 |
Added your email to our buildkite org. |
vllm/engine/protocol.py
Outdated
tokenizer = await self.get_tokenizer() | ||
self.input_preprocessor = InputPreprocessor(model_config, | ||
self.tokenizer) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Btw why is this function defined inside a protocol? Perhaps we should move this to LLMEngine
? Then we can make use of the existing input_preprocessor
defined there. @youkaichao
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry I made a mistake here, I just pushed another commit to change it from self.input_preprocessor
and self.tokenizer
to input_preprocessor
and tokenizer
.
By the way, in the current release v0.6.3, this function is inside AsyncLLMEngine
and it has been moved here in a recent pr: 9296
Hi @DarkLight1337, thank you for the review! I noticed the changes in PR #9473 and will merge the latest code once that PR is merged. Regarding the conflicts between the two PRs:
@njhill, could you please review these changes? |
Thanks @FerdinandZhong, I'll review tomorrow (Friday US time) |
…_search_multi_modality
…_search_multi_modality
…_search_multi_modality
Thanks @njhill. May I know if changes look okay to you? |
@FerdinandZhong sorry for the delay, bit behind with things. The changes look good to me thanks. The InputProcessor change makes sense. Re the logprobs, it's a good point that the number returned will be based on the beam width rather than how many are actually requested. I think we can improve this to request the max of these two and truncate as needed. But no need to change that for this PR. I think we can improve the impl quite a bit more overall in some follow-on updates including:
|
Hi @njhill, thank you for your comment! I agree that taking the maximum of the |
I'm surprised there are so many efforts for adding various features for beam-search ... We will work for implementing beam search in another way so that all features for normal generation just works. |
Hi @youkaichao, thank you for the feedback. I understand your concerns, and I agree that, in the long run, beam search can be properly designed and implemented. In the short term, I’m happy to continue providing feedback and contributing commits from a user’s perspective. In the meantime, @DarkLight1337, could we consider merging this PR first, as it addresses the fix for #9577? |
@njhill do we need to rebuild this? |
…_search_multi_modality
…_search_multi_modality Signed-off-by: Qishuai [email protected]
Hi @DarkLight1337 , I’ve merged the latest main branch and rerun the tests from my end. May I ask for your advice on the next steps to do with this PR for merging? Additionally, can I check with you if rebase is needed to add "sign-off" for each commit? Thank you. |
Sorry for missing this! You don't have to worry about signing-off for this PR as we can manually set that to pass. |
I have enabled auto-merge which should run all the tests and merge if they pass. |
@DarkLight1337 got it, thanks! |
…ementation (vllm-project#9427) Signed-off-by: Qishuai [email protected] Signed-off-by: Randall Smith <[email protected]>
…ementation (vllm-project#9427) Signed-off-by: Qishuai [email protected] Signed-off-by: NickLucche <[email protected]>
…ementation (vllm-project#9427) Signed-off-by: Qishuai [email protected] Signed-off-by: NickLucche <[email protected]>
…ementation (vllm-project#9427) Signed-off-by: Qishuai [email protected]
…ementation (vllm-project#9427) Signed-off-by: Qishuai [email protected] Signed-off-by: Linkun Chen <[email protected]>
…ementation (vllm-project#9427) Signed-off-by: Qishuai [email protected] Signed-off-by: Loc Huynh <[email protected]>
…ementation (vllm-project#9427) Signed-off-by: Qishuai [email protected] Signed-off-by: Sumit Dubey <[email protected]>
…ementation (vllm-project#9427) Signed-off-by: Qishuai [email protected]
…ementation (vllm-project#9427) Signed-off-by: Qishuai [email protected] Signed-off-by: Maxime Fournioux <[email protected]>
…ementation (vllm-project#9427) Signed-off-by: Qishuai [email protected] Signed-off-by: Tyler Michael Smith <[email protected]>
…ementation (vllm-project#9427) Signed-off-by: Qishuai [email protected]
FIX #9577
Changes in this PR:
This PR introduces the following changes based on the updated beam search implementation:
Support for multi-modality input has been re-enabled for beam search with OpenAI-compatible endpoints.
ChatCompletionRequest
:Added additional validation to disable l
ogprobs
whenuse_beam_search=True
. Since the beam search selects results based oncumulative logprobs
and determines step logprobs bybeam_width
, it ignores thetop_logprobs
andlogprobs
parameters passed in with the request.Unit Test
Added two additional test cases in
tests/entrypoints/openai/test_vision.py
.Manual Testing
The following command was used to launch the server for manual testing:
vllm serve microsoft/Phi-3.5-vision-instruct --api-key token-abc123 --trust-remote-code --max-model-len 4096 --limit-mm-per-prompt image=2
Client script used to test the changes:
Verified the functionality of multi-image input handling and correct response generation using beam search with the above manual tests.