[GuideLLM Refactor] entrypoints and working state (base to create PRs off of til merged into refactor base) #358
base: features/refactor/mock-server
Conversation
Signed-off-by: Mark Kurtz <[email protected]>
Pull Request Overview
This PR refactors the GuideLLM command-line interface to streamline the benchmark command structure while adding mock server functionality and performance optimization features. The changes modernize the CLI interface by removing legacy scenario-based configuration in favor of direct parameter specification and introduce new capabilities for testing and development.
Key changes:
- Replaced scenario-based CLI configuration with direct parameter specification using reorganized option groups
- Added new mock server command with configurable OpenAI/vLLM API compatibility
- Integrated performance optimizations through optional uvloop support and new perf dependency group
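The optional uvloop integration described above typically follows a try/except import pattern so the `perf` extra remains optional. A minimal sketch of that pattern (an illustration, not the PR's actual code):

```python
import asyncio

# Sketch: prefer uvloop's event loop policy when the optional "perf"
# dependency group is installed, and fall back to asyncio's default
# policy otherwise.
try:
    import uvloop
    asyncio.set_event_loop_policy(uvloop.EventLoopPolicy())
except ImportError:
    pass  # uvloop not installed; asyncio's default loop is used

async def main() -> str:
    return "ok"

print(asyncio.run(main()))  # prints "ok" with either event loop
```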
Reviewed Changes
Copilot reviewed 19 out of 21 changed files in this pull request and generated 5 comments.
File | Description |
---|---|
src/guidellm/main.py | Complete CLI overhaul with new parameter structure, mock server command, and uvloop integration |
src/guidellm/settings.py | Updated multiprocessing and scheduler settings with new configuration options |
tests/unit/test_cli.py | Removed legacy CLI version flag tests |
tests/unit/conftest.py | Removed old mock fixtures and test utilities |
tests/unit/mock_* | Updated mock implementations for new architecture |
tests/integration/scheduler/ | Added integration tests for scheduler components |
src/guidellm/utils/typing.py | New utility for extracting literal values from type aliases |
pyproject.toml | Added perf optional dependency group |
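The `get_literal_vals` helper in `src/guidellm/utils/typing.py` is not shown in this review, but its described behavior (extracting literal values from type aliases) can be sketched as follows. The alias members below are placeholders; the real `ProfileType`/`StrategyType` in GuideLLM likely define different values:

```python
from typing import Literal, Union, get_args, get_origin

# Illustrative aliases only; the actual GuideLLM aliases may differ.
ProfileType = Literal["synchronous", "concurrent"]
StrategyType = Literal["constant", "poisson"]

def get_literal_vals(alias) -> frozenset:
    """Recursively collect the values of all Literal members in a type alias."""
    if get_origin(alias) is Literal:
        return frozenset(get_args(alias))
    vals = frozenset()
    for arg in get_args(alias):
        vals |= get_literal_vals(arg)
    return vals

print(sorted(get_literal_vals(Union[ProfileType, StrategyType])))
# ['concurrent', 'constant', 'poisson', 'synchronous']
```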
```python
] = list(get_literal_vals(Union[ProfileType, StrategyType]))


def decode_escaped_str(_ctx, _param, value):
```
The decode_escaped_str function is defined but never used in this file. Consider removing it or moving it to a utilities module if it's needed elsewhere.
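For context, a Click-style callback named `decode_escaped_str` usually converts backslash escapes typed on the command line into real characters. This is a hedged guess at the shape of the unused helper flagged above, not its actual implementation:

```python
def decode_escaped_str(_ctx, _param, value):
    """Click-style option callback (sketch): turn literal escapes such as
    "\\n" in a CLI argument into their real characters."""
    if value is None:
        return None
    return value.encode("utf-8").decode("unicode_escape")

print(decode_escaped_str(None, None, "a\\nb") == "a\nb")  # True
```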
```python
type=str,
help="The target path for the backend to run benchmarks against. For example, http://localhost:8000",
"--random-seed",
default=GenerativeTextScenario.get_default("random_seed"),
```
This line references GenerativeTextScenario.get_default() but the import for GenerativeTextScenario is incomplete - only the class is imported but not its methods. This will likely cause a runtime error.
```python
"--cooldown-percent",  # legacy alias
"cooldown",
type=float,
default=GenerativeTextScenario.get_default("cooldown_percent"),
```
Same issue as with random_seed - this references GenerativeTextScenario.get_default() which may not be available due to incomplete import.
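For reference, a scenario class exposing per-field defaults can be sketched with a dataclass. The real `GenerativeTextScenario` is presumably a Pydantic model, and the default values below are placeholders, not the project's actual defaults:

```python
from dataclasses import dataclass, fields

@dataclass
class GenerativeTextScenario:
    # Placeholder defaults for illustration only
    random_seed: int = 42
    cooldown_percent: float = 0.0

    @classmethod
    def get_default(cls, name: str):
        """Return the declared default for a field, as the CLI options above use it."""
        for f in fields(cls):
            if f.name == name:
                return f.default
        raise KeyError(name)

print(GenerativeTextScenario.get_default("random_seed"))  # 42
```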
```python
request_http2: bool = True

# Scheduler settings
mp_context_type: Literal["spawn", "fork", "forkserver"] | None = "fork"
```
[nitpick] Using 'fork' as the default multiprocessing context can cause issues in some environments (especially macOS with certain Python versions). Consider using 'spawn' as a safer default or making it platform-dependent.
```diff
- mp_context_type: Literal["spawn", "fork", "forkserver"] | None = "fork"
+ mp_context_type: Literal["spawn", "fork", "forkserver"] | None = "spawn"
```
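A platform-dependent default, which the reviewer raises as an alternative, could look like this sketch (an assumption for illustration, not the PR's code):

```python
import multiprocessing
import sys

# "fork" is only reliably safe on Linux; macOS (with recent Python versions)
# and Windows should use "spawn".
default_mp_context = "fork" if sys.platform == "linux" else "spawn"
ctx = multiprocessing.get_context(default_mp_context)
print(ctx.get_start_method())
```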
Summary
Refactors the GuideLLM command-line interface: streamlines the benchmark command structure, adds mock server functionality and performance optimizations, and folds in fixes pending in other PRs to stabilize the refactor into a working state.
Details
- Removed the `--scenario` option in favor of direct parameter specification
- Renamed options (`--rate-type` → `--profile`)
- Added a `perf` optional dependency group with `orjson`, `msgpack`, `msgspec`, and `uvloop`
- Added the `--output-formats` option
- Added error-constraint options (`--max-errors`, `--max-error-rate`, `--max-global-error-rate`)

Test Plan
Related Issues
Use of AI
## WRITTEN BY AI ##