[Bladellm] Support dispatch feature for BladeLLM #86
base: main
docs/Arguments.md
Outdated
@@ -32,8 +32,11 @@ usage: -m llumnix.entrypoints.vllm.api_server [-h]
 [--profiling-result-file-path PROFILING_RESULT_FILE_PATH]
 [--gpu-type GPU_TYPE]
 [--polling-interval POLLING_INTERVAL]
-[--migration-backend {gloo,nccl,rpc}]
+[--migration-backend {gloo,nccl,rpc,grpc,kvtransfer}]
We need to add explanations for the blade args in the help strings and in arguments.py.
Got it.
Also, having both "rpc" and "grpc" seems confusing. Maybe rename "rpc" to "rayrpc"?
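One way the two suggestions in this thread (documented choices, and a less ambiguous name than "rpc") could fit together is sketched below. This is only an illustration, not the project's parser: the `rayrpc` name is the proposal from the comment above, the alias table and the `normalize_backend` and `build_parser` helpers are hypothetical, and the help text is placeholder wording.

```python
import argparse

# Hypothetical alias table: old spelling -> proposed spelling.
# The rename itself is only a suggestion made in this review thread.
DEPRECATED_ALIASES = {"rpc": "rayrpc"}

# Choices from the diff above, with "rpc" replaced by the proposed "rayrpc".
BACKENDS = ["gloo", "nccl", "rayrpc", "grpc", "kvtransfer"]


def normalize_backend(value: str) -> str:
    """Map a deprecated backend name to its replacement; leave others unchanged."""
    return DEPRECATED_ALIASES.get(value, value)


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="example")
    parser.add_argument(
        "--migration-backend",
        # argparse applies `type` before checking `choices`,
        # so the old spelling "rpc" is still accepted.
        type=normalize_backend,
        choices=BACKENDS,
        help="Communication backend used for migration (placeholder text). "
             "'rpc' is accepted as a deprecated alias of 'rayrpc'.",
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(["--migration-backend", "rpc"])
    print(args.migration_backend)
```

Keeping the old spelling as an alias would let existing launch scripts keep working while the documentation moves to the new name.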
assert args.migration_backend != 'gloo' or (args.migration_backend == 'gloo' \
    and not args.disable_init_instance_by_manager and not args.disable_fixed_node_init_instance), \
    ("When using gloo as migration backend, "
     "do not set --disable-init-instance-by-manager and --disable-fixed-node-init-instance.")

assert args.migration_backend not in ['kvtransfer'] or (args.migration_backend == 'kvtransfer' \
We should add asserts for the args that are only used by vllm/blade.
TODO: clarify the features of vllm/blade in the parser, arg_utils, etc.
Got it.
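The per-engine checks requested in this thread could be sketched roughly as follows. Everything here is an assumption made for illustration: the engine names, the `SUPPORTED_BACKENDS` split (inferred only from the fact that this PR adds `grpc` and `kvtransfer` alongside BladeLLM support), and the `check_engine_args` helper do not come from the PR. The gloo rule mirrors the assert quoted above in a logically equivalent form, since `A != x or (A == x and B)` reduces to `A != x or B`.

```python
from argparse import Namespace

# Assumed split between engines; not confirmed by the PR.
SUPPORTED_BACKENDS = {
    "vllm": {"gloo", "nccl", "rpc"},
    "bladellm": {"grpc", "kvtransfer"},
}


def check_engine_args(engine: str, args: Namespace) -> None:
    """Raise ValueError if the arguments do not fit the chosen engine (hypothetical helper)."""
    if engine not in SUPPORTED_BACKENDS:
        raise ValueError(f"Unknown engine: {engine!r}")

    if args.migration_backend not in SUPPORTED_BACKENDS[engine]:
        allowed = ", ".join(sorted(SUPPORTED_BACKENDS[engine]))
        raise ValueError(
            f"--migration-backend {args.migration_backend} is not supported "
            f"by {engine}; choose one of: {allowed}."
        )

    # Same condition as the gloo assert in the diff, written without the redundant clause.
    if args.migration_backend == "gloo" and (
        args.disable_init_instance_by_manager
        or args.disable_fixed_node_init_instance
    ):
        raise ValueError(
            "When using gloo as migration backend, do not set "
            "--disable-init-instance-by-manager and "
            "--disable-fixed-node-init-instance."
        )


if __name__ == "__main__":
    ok = Namespace(
        migration_backend="gloo",
        disable_init_instance_by_manager=False,
        disable_fixed_node_init_instance=False,
    )
    check_engine_args("vllm", ok)  # passes silently
```

Raising `ValueError` rather than using `assert` is a design choice in this sketch: assert statements are skipped when Python runs with `-O`, so explicit exceptions keep the validation active in every mode.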
llumnix/queue/utils.py
Outdated
logger = init_logger(__name__)

class AsyncPutQueueActor:
Why put this actor here?
Moved it to backends/utils.py now.
a3ff024 to cdac2e3
refine fix loguru error fix kwarg error fix ray autoscale error fix fix
347378f to 716cb02
No description provided.