Conversation

markurtz (Collaborator)

Summary

A refactor of the GuideLLM command-line interface that streamlines the benchmark command structure, adds new mock server functionality and performance optimization features, and folds in missing fixes from other PRs to stabilize the refactor into a working state.

Details

  • CLI Interface Overhaul:
    • Removed the legacy --scenario option in favor of direct parameter specification
    • Reorganized CLI options with clear grouping (Backend, Data, Output, Aggregators, Constraints)
    • Added parameter aliases for backward compatibility (e.g., --rate-type → --profile)
    • Simplified option defaults by removing scenario-based defaults
    • Added comprehensive docstrings and help text for all commands and options
  • New Mock Server Command:
    • Added guidellm mock-server command with full OpenAI/vLLM API compatibility
    • Configurable latency characteristics (request latency, TTFT, ITL, output tokens); see the latency sketch after this list
    • Support for both streaming and non-streaming endpoints
    • Comprehensive server configuration options (host, port, workers, model name)
  • Performance Optimization Features:
    • Added a new perf optional dependency group with orjson, msgpack, msgspec, and uvloop
    • Integrated uvloop for enhanced async performance when available
    • Optimized event-loop policy selection based on availability; see the event-loop sketch after this list
  • Internal Architecture Improvements:
    • Updated import paths (guidellm.backend → guidellm.backends, guidellm.scheduler.strategy → guidellm.scheduler)
    • Replaced scenario-based benchmarking with direct benchmark_generative_text function calls
    • Enhanced error handling and parameter validation
    • Simplified logging format for better readability
  • Enhanced Output and Configuration:
    • Added support for multiple output formats via the --output-formats option
    • Improved output path handling for files vs. directories
    • Added new constraint options (--max-errors, --max-error-rate, --max-global-error-rate)
    • Enhanced warmup/cooldown specification with flexible numeric/percentage options
  • Code Quality Improvements:
    • Comprehensive type annotations throughout the codebase
    • Detailed docstrings following Google/NumPy style conventions
    • Consistent parameter naming and organization
    • Removed deprecated version option from main CLI group
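
A minimal sketch of how the mock server's configurable latency characteristics (TTFT and ITL) could be simulated, as referenced in the mock-server item above. This is illustrative only; the stream_mock_tokens name and its ttft/itl parameters are assumptions, not the actual mock-server code:

import asyncio
from collections.abc import AsyncIterator

async def stream_mock_tokens(
    tokens: list[str],
    ttft: float = 0.1,  # assumed parameter: time to first token, in seconds
    itl: float = 0.02,  # assumed parameter: inter-token latency, in seconds
) -> AsyncIterator[str]:
    # Delay the first token by TTFT, then space subsequent tokens by ITL.
    await asyncio.sleep(ttft)
    for index, token in enumerate(tokens):
        if index > 0:
            await asyncio.sleep(itl)
        yield token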

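A minimal sketch of the optional uvloop integration described above, assuming the common try-import pattern; the configure_event_loop_policy helper name is hypothetical, and the actual wiring in src/guidellm/main.py may differ:

import asyncio

try:
    import uvloop  # provided by the optional "perf" dependency group
except ImportError:
    uvloop = None

def configure_event_loop_policy() -> None:
    # Prefer uvloop's faster event loop when it is installed; otherwise
    # keep asyncio's default policy.
    if uvloop is not None:
        asyncio.set_event_loop_policy(uvloop.EventLoopPolicy())
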
Test Plan

  • Tests for entrypoints to be added later

Related Issues

  • Part of the larger scheduler refactor initiative

  • "I certify that all code in this PR is my own, except as noted below."

Use of AI

  • Includes AI-assisted code completion
  • Includes code generated by an AI application
  • Includes AI-generated tests (NOTE: AI written tests should have a docstring that includes ## WRITTEN BY AI ##)

Copilot AI (Contributor) left a comment:

Pull Request Overview

This PR refactors the GuideLLM command-line interface to streamline the benchmark command structure while adding mock server functionality and performance optimization features. The changes modernize the CLI interface by removing legacy scenario-based configuration in favor of direct parameter specification and introduce new capabilities for testing and development.

Key changes:

  • Replaced scenario-based CLI configuration with direct parameter specification using reorganized option groups
  • Added new mock server command with configurable OpenAI/vLLM API compatibility
  • Integrated performance optimizations through optional uvloop support and new perf dependency group

Reviewed Changes

Copilot reviewed 19 out of 21 changed files in this pull request and generated 5 comments.

Summary per file:

  • src/guidellm/main.py: Complete CLI overhaul with new parameter structure, mock server command, and uvloop integration
  • src/guidellm/settings.py: Updated multiprocessing and scheduler settings with new configuration options
  • tests/unit/test_cli.py: Removed legacy CLI version flag tests
  • tests/unit/conftest.py: Removed old mock fixtures and test utilities
  • tests/unit/mock_*: Updated mock implementations for the new architecture
  • tests/integration/scheduler/: Added integration tests for scheduler components
  • src/guidellm/utils/typing.py: New utility for extracting literal values from type aliases
  • pyproject.toml: Added perf optional dependency group


] = list(get_literal_vals(Union[ProfileType, StrategyType]))


def decode_escaped_str(_ctx, _param, value):

Copilot AI commented on Sep 19, 2025:

The decode_escaped_str function is defined but never used in this file. Consider removing it or moving it to a utilities module if it's needed elsewhere.

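For context, a sketch of what the get_literal_vals helper from src/guidellm/utils/typing.py might look like, based on its usage above; the actual implementation may differ:

from typing import Literal, Union, get_args, get_origin

def get_literal_vals(alias) -> frozenset[str]:
    # Collect the string values of a Literal alias, recursing through
    # Union members so nested aliases are flattened.
    origin = get_origin(alias)
    if origin is Literal:
        return frozenset(str(value) for value in get_args(alias))
    if origin is Union:
        return frozenset().union(*(get_literal_vals(arg) for arg in get_args(alias)))
    return frozenset()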

type=str,
help="The target path for the backend to run benchmarks against. For example, http://localhost:8000",

...

"--random-seed",
default=GenerativeTextScenario.get_default("random_seed"),

Copilot AI commented on Sep 19, 2025:

This line references GenerativeTextScenario.get_default() but the import for GenerativeTextScenario is incomplete - only the class is imported but not its methods. This will likely cause a runtime error.


"--cooldown-percent", # legacy alias
"cooldown",
type=float,
default=GenerativeTextScenario.get_default("cooldown_percent"),

Copilot AI commented on Sep 19, 2025:

Same issue as with random_seed - this references GenerativeTextScenario.get_default() which may not be available due to incomplete import.


request_http2: bool = True

# Scheduler settings
mp_context_type: Literal["spawn", "fork", "forkserver"] | None = "fork"

Copilot AI commented on Sep 19, 2025:

[nitpick] Using 'fork' as the default multiprocessing context can cause issues in some environments (especially macOS with certain Python versions). Consider using 'spawn' as a safer default or making it platform-dependent.

Suggested change:

- mp_context_type: Literal["spawn", "fork", "forkserver"] | None = "fork"
+ mp_context_type: Literal["spawn", "fork", "forkserver"] | None = "spawn"

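A sketch of the platform-dependent default the reviewer suggests; this is not code from the PR, and default_mp_context is a hypothetical helper:

import multiprocessing
import sys

def default_mp_context() -> str:
    # "fork" is fast on Linux but unsafe on macOS and unavailable on
    # Windows, so fall back to "spawn" on those platforms.
    if sys.platform in ("darwin", "win32"):
        return "spawn"
    return "fork"

ctx = multiprocessing.get_context(default_mp_context())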
