[V1] Fix torch profiling for offline inference #11125
Signed-off-by: Roger Wang <[email protected]>
InprocClient
Thanks for fixing this!
@@ -15,19 +16,25 @@
# Create a sampling params object.
sampling_params = SamplingParams(temperature=0.8, top_p=0.95)

# Create an LLM.
llm = LLM(model="facebook/opt-125m", tensor_parallel_size=1)
if __name__ == "__main__":
Note that you shouldn't need the if __name__ == "__main__": guard once #11074 lands.
Tested with
VLLM_USE_V1=1 python3 examples/offline_inference_with_profiler.py
Without this fix, running the same script will error out.

Edit: Also fixed for
VLLM_USE_V1=1 VLLM_ENABLE_V1_MULTIPROCESSING=1 python3 examples/offline_inference_with_profiler.py
The notable changes include wrapping the example in an if __name__ == "__main__": guard to make it work with MP/SyncMPClient.
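
For readers wondering why the guard matters under multiprocessing, here is a minimal stdlib-only sketch. It assumes the engine's child process is started with a method that re-imports the main script (as Python's spawn start method does, under the module name `__mp_main__`); whether vLLM uses that start method in a given setup is an assumption here, not something stated in this PR. The `SCRIPT` contents and the `run_as` helper are hypothetical stand-ins for the example script, and no vLLM code is involved.

```python
import os
import runpy
import tempfile

# Hypothetical stand-in for an example script: one unguarded statement
# and one statement behind the __main__ guard.
SCRIPT = '''
events = []
events.append("import-time code ran")
if __name__ == "__main__":
    events.append("guarded code ran")
'''


def run_as(run_name: str) -> list:
    """Execute SCRIPT under the given module name and return its recorded events."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "example_script.py")
        with open(path, "w") as f:
            f.write(SCRIPT)
        # run_path returns the globals of the executed script.
        module_globals = runpy.run_path(path, run_name=run_name)
    return module_globals["events"]


# Direct execution: python3 example_script.py
as_parent = run_as("__main__")
# What a spawned child sees when it re-imports the parent's main script.
as_child = run_as("__mp_main__")

print(as_parent)
print(as_child)
```

In this sketch the guarded statement runs only under `__main__`, so anything placed behind the guard (in the example script, creating the LLM and running the profiled generation) would not be repeated in a child that re-imports the script.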