[CI] Turn on basic correctness tests for V1 #10864
Conversation
Signed-off-by: Tyler Michael Smith <[email protected]>
👋 Hi! Thank you for contributing to the vLLM project. Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging. To run CI, PR reviewers can do one of these:
Signed-off-by: Tyler Michael Smith <[email protected]>
Signed-off-by: Tyler Michael Smith <[email protected]>
Signed-off-by: Tyler Michael Smith <[email protected]>
@tlrmchlsmth do we still have any issues blocking this PR from merging?
@WoosukKwon I tried it again after merging #9856 but it's still red. Maybe there are issues with the UniprocExecutor that can be resolved one way or another.
@tlrmchlsmth In vllm/vllm/v1/worker/gpu_model_runner.py (line 704 at c2d1b07), `logits = self.model.compute_logits(hidden_states, None)` consumes much more memory than in V0, and that leads to the OOM. But I don't know how to figure out why.
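For tracking that down, here is a minimal debugging sketch (not part of the vLLM codebase; the helper name is made up) that reports how much GPU memory the `compute_logits` call allocates, given the `model` and `hidden_states` objects from `gpu_model_runner.py`:

```python
import torch

def measure_logits_memory(model, hidden_states):
    """Hypothetical debugging helper: report the GPU memory allocated by
    model.compute_logits for the given hidden states."""
    torch.cuda.synchronize()
    before = torch.cuda.memory_allocated()

    logits = model.compute_logits(hidden_states, None)

    torch.cuda.synchronize()
    after = torch.cuda.memory_allocated()
    print(f"compute_logits allocated {(after - before) / 2**20:.1f} MiB "
          f"for logits of shape {tuple(logits.shape)}")
    return logits
```

Comparing that number between V0 and V1 for the same batch would show whether the logits tensor itself is bigger or whether the extra memory comes from somewhere else.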
Hey, I'm pretty sure the failure was previously due to the engine not cleaning up properly, but I'm curious what you were seeing before?
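For reference, one common pattern for making sure an engine's GPU memory is actually released between runs in the same process looks roughly like this (a sketch with placeholder names and model, not the fix used in this PR):

```python
import gc

import torch
from vllm import LLM


def run_generation_once(prompts):
    # Build an engine, generate, then tear it down explicitly so a later
    # engine in the same process isn't left fighting for GPU memory.
    llm = LLM(model="facebook/opt-125m")
    outputs = llm.generate(prompts)
    del llm
    gc.collect()
    torch.cuda.empty_cache()
    return outputs
```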
Let's get this enabled
Signed-off-by: Fred Reiss <[email protected]>
Ran into some OOM issues in #9856. Turning on this test to see if the problems happen without the TP changes.
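For readers following along, the kind of basic correctness smoke test being switched on here looks roughly like the sketch below (the model name, test name, and structure are illustrative, not the actual test file; `VLLM_USE_V1` selects the V1 engine):

```python
import pytest
from vllm import LLM, SamplingParams

PROMPTS = ["Hello, my name is", "The capital of France is"]


@pytest.mark.parametrize("use_v1", ["0", "1"])
def test_basic_generation(use_v1, monkeypatch):
    # Greedy sampling keeps the outputs deterministic enough for a smoke test.
    monkeypatch.setenv("VLLM_USE_V1", use_v1)
    llm = LLM(model="facebook/opt-125m")
    params = SamplingParams(temperature=0.0, max_tokens=16)
    outputs = llm.generate(PROMPTS, params)
    assert len(outputs) == len(PROMPTS)
    for out in outputs:
        assert out.outputs[0].text.strip()
```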