[GuideLLM Refactor] scheduler package updates, rewrites, and tests expansion #354
Summary
Introduces a constraints system and enhanced timing control for the scheduler refactor. The implementation replaces hardcoded execution limits with a flexible, composable constraint system for defining benchmark stopping criteria. Request timing calculations also move from being precalculated to being computed per request, which enables dynamic rate adjustments and better distributed coordination.
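As a rough illustration of the two ideas in this summary, the sketch below shows composable stop conditions expressed through a Protocol, plus request start offsets computed one request at a time. It is not the code from this PR: `State`, `MaxNumber`, `MaxDuration`, `should_stop`, `constant_rate_offset`, and `PoissonOffsets` are hypothetical names chosen for the example, and the real classes listed under Details may have different signatures and behavior.

```python
import random
from dataclasses import dataclass
from typing import Protocol, Sequence


@dataclass
class State:
    """Hypothetical stand-in for scheduler state (not the PR's SchedulerState)."""
    start_time: float
    now: float
    processed: int = 0
    errored: int = 0


class Constraint(Protocol):
    """Any callable that inspects the state and says whether to stop."""

    def __call__(self, state: State) -> bool: ...


@dataclass
class MaxNumber:
    """Stop once a given number of requests has been processed."""
    max_num: int

    def __call__(self, state: State) -> bool:
        return state.processed >= self.max_num


@dataclass
class MaxDuration:
    """Stop once a given number of seconds has elapsed."""
    max_seconds: float

    def __call__(self, state: State) -> bool:
        return state.now - state.start_time >= self.max_seconds


def should_stop(constraints: Sequence[Constraint], state: State) -> bool:
    """Compose constraints: stop as soon as any one of them is satisfied."""
    return any(constraint(state) for constraint in constraints)


def constant_rate_offset(request_index: int, rate: float) -> float:
    """Start offset in seconds for one request at a fixed rate (requests/second).

    Computed on demand from the request index rather than from a prebuilt schedule.
    """
    return request_index / rate


class PoissonOffsets:
    """Per-request start offsets with exponentially distributed gaps."""

    def __init__(self, rate: float, seed: int = 0) -> None:
        self._rng = random.Random(seed)
        self._rate = rate
        self._offset = 0.0

    def next_offset(self) -> float:
        # Each call draws one inter-arrival gap and advances the running offset.
        self._offset += self._rng.expovariate(self._rate)
        return self._offset
```

Because each constraint is just a callable over the state, adding a new stopping criterion in this sketch means writing one more small class, with no change to `should_stop`. Likewise, computing offsets per request means the rate could in principle be changed between calls, which a precalculated schedule would not allow.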
Details
- constraints.py: Implements Protocol-based constraint architecture with support for request limits, duration limits, error thresholds, and sliding window error rates
  - MaxNumberConstraint: Limits execution based on request count
  - MaxDurationConstraint: Limits execution based on time duration
  - MaxErrorsConstraint: Limits execution based on absolute error count
  - MaxErrorRateConstraint: Limits execution based on sliding window error rate
  - MaxGlobalErrorRateConstraint: Limits execution based on global error rate
  - ConstraintsInitializerFactory: Registry system for constraint creation and serialization
- objects.py: Replaced result.py and expanded capabilities
  - BackendInterface protocol for type-safe backend integration
  - ScheduledRequestInfo with comprehensive timing and status tracking
  - SchedulerState for distributed state coordination
  - SchedulerUpdateAction for constraint-based control signals
- strategy.py: Introduced request timing abstractions
  - ScheduledRequestTimings base class for timing implementations
  - LastCompletionRequestTimings: For synchronous and concurrent strategies
  - NoDelayRequestTimings: For maximum throughput strategies
  - ConstantRateRequestTimings: For fixed-rate scheduling
  - PoissonRateRequestTimings: For stochastic request patterns
- environment.py: Coordination layer for distributed execution
  - Environment protocol for distributed synchronization
  - NonDistributedEnvironment implementation for single-node execution
- worker.py, worker_group.py: Distributed request processing infrastructure
Test Plan
Related Issues
Use of AI
## WRITTEN BY AI ##