[platforms] enable platform plugins #11602
Conversation
Signed-off-by: youkaichao <[email protected]>
👋 Hi! Thank you for contributing to the vLLM project. Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging. To run CI, PR reviewers can do one of these:
- label: Plugin Tests (2 GPUs) # 40min
  working_dir: "/vllm-workspace/tests"
  num_gpus: 2
  fast_check: true
If they import current_platform too early, this test will fail, and the error message will tell them which line to blame.
I'm OK with this. Looks clear enough.
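For context, a minimal sketch of how such a check could point at the offending import site; the helper names are hypothetical and this is not the actual vLLM test code:

# hypothetical sketch: record where the platform was first resolved so a
# later assertion can tell the user which line to blame
import traceback

_resolution_stack = None

def resolve_current_platform():
    # Capture the call stack the first time resolution happens.
    global _resolution_stack
    if _resolution_stack is None:
        _resolution_stack = "".join(traceback.format_stack())
    return _resolution_stack

def assert_resolved_after_plugins(plugins_loaded: bool):
    # Fail with a message that includes the stack trace of the early resolution.
    if _resolution_stack is not None and not plugins_loaded:
        raise AssertionError(
            "current_platform was resolved before platform plugins were "
            "loaded; it was first resolved here:\n" + _resolution_stack)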
@@ -265,13 +264,13 @@ def prepare_model_input(
        """
        raise NotImplementedError

    @current_platform.inference_mode()
This is removed directly. Is it useless now?
It has been moved to https://github.com/vllm-project/vllm/pull/11602/files#diff-1ca8953d4dab62fff08a34c8cf370d2a37aeb009526ab4fc5464a65e1d03b036R84 . This line actually does nothing; no subclass calls this function.
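For reference, a rough sketch of the shape such a platform-level inference_mode hook can take (an assumption about the design, not the exact vLLM code):

import torch

class Platform:
    @classmethod
    def inference_mode(cls):
        # Default inference context. A platform that does not support
        # torch.inference_mode() could override this, for example with
        # torch.no_grad() (sketch assumption).
        return torch.inference_mode()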
LGTM
vllm/platforms/__init__.py
# lazy init current_platform so that plugins can import vllm.platforms
# to inherit Platform without circular imports
Correct me if I'm wrong, but isn't the main reason for lazy init to allow platform plugins to be loaded before the built-in ones?
The circular import problem should already be solved by returning strings in the built-in plugins instead of direct import.
Yes, you are right. What this comment means is: third-party platform developers also need to import vllm.platforms and inherit the base Platform. Therefore, we cannot resolve the current platform during import vllm.platforms.
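To make the constraint concrete, here is a hypothetical out-of-tree plugin sketch; the package name, class attributes, and entry-point wiring are assumptions based on this discussion, not a definitive API:

# my_npu_plugin/__init__.py  (hypothetical out-of-tree package)
from vllm.platforms import Platform  # must work without resolving current_platform

class MyNPUPlatform(Platform):
    device_name = "npu"  # illustrative attribute only

def register() -> str:
    # Return the class path as a string so vLLM can import it lazily,
    # mirroring how the built-in platform checks return strings.
    return "my_npu_plugin.MyNPUPlatform"

# The plugin package would expose register() via an entry point, e.g. in
# pyproject.toml:
#   [project.entry-points."vllm.platform_plugins"]
#   my_npu = "my_npu_plugin:register"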
I think we can avoid this import by moving the code into a registry.py, similar to that for models.
Even if we have vllm.platforms.registry, when people import that module, vllm/platforms/__init__.py will be executed.
We can also move current_platform into vllm.platforms.registry so it's not imported automatically when accessing other platform modules.
Explained in 8f43a03. Let me know if it works.
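For readers following the thread, here is a minimal, self-contained sketch of one way to resolve current_platform lazily via a module-level __getattr__ (PEP 562); the helper functions are placeholder stubs and this is not necessarily what 8f43a03 implements:

# sketch of a lazily-resolving vllm/platforms/__init__.py; helpers are stubs
_current_platform = None

def _load_oot_platform():
    # Placeholder: a real implementation would consult the platform-plugin
    # entry points and import the class path each plugin returns.
    return None

def _detect_builtin_platform():
    # Placeholder: a real implementation would probe CUDA/ROCm/CPU etc. and
    # import the matching Platform subclass from its class-path string.
    return object()

def _resolve_current_platform():
    global _current_platform
    if _current_platform is None:
        _current_platform = _load_oot_platform() or _detect_builtin_platform()
    return _current_platform

def __getattr__(name):
    # PEP 562: only called for names not defined at module level, so plain
    # "import vllm.platforms" never triggers platform detection.
    if name == "current_platform":
        return _resolve_current_platform()
    raise AttributeError(f"module {__name__!r} has no attribute {name!r}")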
This is much clearer now, thanks!
I really don't like the necessity of the lazy import (similar to the CUDA re-initialization problem), but we can iterate on this later if needed...
I agree we should avoid lazy import if possible. But there is some historical code we need to take care of; many people unconsciously initialize CUDA or trigger …
Sorry for the delay. Overall LGTM. Just one question: maybe this PR lacks a function like …
We can add it later if it is necessary; right now I don't see the necessity. OOT-platform-related code always runs on OOT platforms, and vLLM's main-branch code doesn't need to consider OOT platform code. I think vLLM mainly exposes integration hooks to OOT platforms. For example, when vLLM's code uses …
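As a rough illustration of what such integration hooks can look like in practice (the hook and attribute names below are assumptions, not vLLM's definitive API):

from vllm.platforms import current_platform  # resolved lazily, as discussed above

def select_device_communicator() -> str:
    # In-tree code only queries the resolved platform; an out-of-tree
    # platform changes behavior by overriding Platform hooks, so no
    # OOT-specific branches are needed here.
    if current_platform.is_cuda():  # assumed helper method
        return "nccl"
    return getattr(current_platform, "device_name", "cpu")  # assumed attribute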
OK, thanks! We will verify NPU as an OOT device soon. If necessary, I will add it in a new PR.
f598a67 to 590f07a
Failed tests are not related; merging.
Refactor of #11222; enables out-of-tree (OOT) registered platforms.