[UX][Serve] More comprehensive target_qps_per_replica check #3361

Merged
cblmemo merged 3 commits into master from serve-autoscaling-ux on Mar 25, 2024

cblmemo (Collaborator) commented on Mar 24, 2024

Fixes #3359

Tested (run the relevant ones):

  • Code formatting: bash format.sh
  • Any manual or new tests for this PR (please specify below)
  1. Autoscaling enabled
service:
  readiness_probe:
    path: /health
    initial_delay_seconds: 20
  replica_policy:
    min_replicas: 1
    max_replicas: 2
    target_qps_per_replica: 1
$ sky serve up -n http @temp/svc.yaml
Service from YAML spec: @temp/svc.yaml
Service Spec:
Readiness probe method:           GET /health
Readiness initial delay seconds:  20
Replica autoscaling policy:       Autoscaling from 1 to 2 replicas with Target 1 QPS/replica
Spot Policy:                      No spot policy
  2. Error (target_qps_per_replica not set)
service:
  readiness_probe:
    path: /health
    initial_delay_seconds: 20
  replica_policy:
    min_replicas: 1
    max_replicas: 2
$ sky serve up -n http @temp/svc.yaml
Service from YAML spec: @temp/svc.yaml
ValueError: Detect different min_replicas and max_replicas while target_qps_per_replica is not set. To enable autoscaling, please set target_qps_per_replica.
  3. Fixed replicas when min==max
service:
  readiness_probe:
    path: /health
    initial_delay_seconds: 20
  replica_policy:
    min_replicas: 1
    max_replicas: 1
    target_qps_per_replica: 1
$ sky serve up -n http @temp/svc.yaml
Service from YAML spec: @temp/svc.yaml
Service Spec:
Readiness probe method:           GET /health
Readiness initial delay seconds:  20
Replica autoscaling policy:       Fixed 1 replica
Spot Policy:                      No spot policy
  4. Autoscaling disabled
service:
  readiness_probe:
    path: /health
    initial_delay_seconds: 20
  replica_policy:
    min_replicas: 1
    max_replicas: 1
$ sky serve up -n http @temp/svc.yaml
Service from YAML spec: @temp/svc.yaml
Service Spec:
Readiness probe method:           GET /health
Readiness initial delay seconds:  20
Replica autoscaling policy:       Fixed 1 replica
Spot Policy:                      No spot policy
  • All smoke tests: pytest tests/test_smoke.py
  • Relevant individual smoke tests: pytest tests/test_smoke.py::test_fill_in_the_name
  • Backward compatibility tests: bash tests/backward_comaptibility_tests.sh
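
The four manual cases above can be summarized in a short sketch. This is only an illustration of the behavior shown in the outputs, not the code in sky/serve/service_spec.py: the function names `validate_replica_policy` and `describe_policy` are hypothetical, and the plural form "replicas" in the fixed case is an assumption. Only the error message and the summary strings are copied from the logs above.

```python
from typing import Optional

# Error text copied from the `sky serve up` output in case 2 above.
_MISSING_TARGET_QPS_MSG = (
    'Detect different min_replicas and max_replicas while '
    'target_qps_per_replica is not set. To enable autoscaling, '
    'please set target_qps_per_replica.')


def validate_replica_policy(
        min_replicas: int,
        max_replicas: int,
        target_qps_per_replica: Optional[float] = None) -> bool:
    """Hypothetical helper: return True if autoscaling is enabled.

    Raises ValueError when min and max differ but no QPS target is given,
    since the replica count would then have no signal to scale on.
    """
    if min_replicas != max_replicas and target_qps_per_replica is None:
        raise ValueError(_MISSING_TARGET_QPS_MSG)
    # With min == max the replica count is fixed, whether or not
    # target_qps_per_replica is set (cases 3 and 4).
    return min_replicas != max_replicas


def describe_policy(
        min_replicas: int,
        max_replicas: int,
        target_qps_per_replica: Optional[float] = None) -> str:
    """Hypothetical helper: build the one-line policy summary."""
    if validate_replica_policy(min_replicas, max_replicas,
                               target_qps_per_replica):
        return (f'Autoscaling from {min_replicas} to {max_replicas} '
                f'replicas with Target {target_qps_per_replica} QPS/replica')
    # Assumption: the noun is pluralized for counts other than 1.
    noun = 'replica' if min_replicas == 1 else 'replicas'
    return f'Fixed {min_replicas} {noun}'


if __name__ == '__main__':
    print(describe_policy(1, 2, 1))  # case 1: autoscaling enabled
    print(describe_policy(1, 1, 1))  # case 3: fixed, target set
    print(describe_policy(1, 1))     # case 4: fixed, target unset
    try:
        describe_policy(1, 2)        # case 2: min != max, target unset
    except ValueError as e:
        print(f'ValueError: {e}')
```

Keeping the check separate from the summary string means the same validation can run before any output is produced, which matches case 2, where the error appears before a Service Spec is printed.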

concretevitamin (Member) left a comment


Thanks @cblmemo!

3 review comments on sky/serve/service_spec.py (outdated, resolved)
cblmemo merged commit 90eeb00 into master on Mar 25, 2024
20 checks passed
cblmemo deleted the serve-autoscaling-ux branch on March 25, 2024, 05:44
Successfully merging this pull request may close these issues.

[Serve] Service update scales to zero unexpectedly?