[Serve] Service update scales to zero unexpectedly? #3359

Closed
concretevitamin opened this issue Mar 23, 2024 · 1 comment · Fixed by #3361

Comments

@concretevitamin

Repro:

  # replicas: 1
  replica_policy:
    min_replicas: 0
    max_replicas: 1
  • sky serve update http examples/serve/http_server/task.yaml -y

Observed: Status immediately became

Services
NAME  VERSION  UPTIME  STATUS      REPLICAS  ENDPOINT
http  -        6m 48s  NO_REPLICA  0/1       34.239.227.235:30001

Service Replicas
SERVICE_NAME  ID  VERSION  IP              LAUNCHED    RESOURCES       STATUS         REGION
http          1   1        44.213.109.175  7 mins ago  1x AWS(vCPU=2)  SHUTTING_DOWN  us-east-1

After a while it became

Services
NAME  VERSION  UPTIME  STATUS      REPLICAS  ENDPOINT
http  -        8m 14s  NO_REPLICA  0/0       34.239.227.235:30001

Service Replicas
No existing replicas.

The curl requests are still ongoing, yet they fail to wake up the service.

Both problems seem unexpected. Since there's constant traffic hitting the service, scale-to-zero shouldn't have kicked in. Also, why is the service not woken up?

commit 82c50f5

cc @cblmemo @Michaelvll

@cblmemo

cblmemo commented Mar 24, 2024

Oh, I suppose that is because you didn't set target_qps_per_replica. Reference:

  def _cal_target_num_replicas_based_on_qps(self) -> int:
      # Recalculate target_num_replicas based on QPS.
      # Reclip self.target_num_replicas with new min and max replicas.
      if self.target_qps_per_replica is None:
          return self.min_replicas

The UX in current master is problematic: it shows "Autoscaling from 0 to 1 replica" even when autoscaling is not enabled. We need to warn the user that if target_qps_per_replica is not set, autoscaling will not be enabled. Just submitted a PR to fix this: #3361
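
To make the effect on the repro concrete, here is a minimal standalone sketch of the early return quoted above. It is not the actual SkyServe implementation: the function name, its signature, and the ceil-and-clip step for the case where a QPS target is set are assumptions for illustration only. Only the "return min_replicas when target_qps_per_replica is None" branch comes from the snippet in this thread.

```python
import math
from typing import Optional


def target_num_replicas(
    current_qps: float,
    min_replicas: int,
    max_replicas: int,
    target_qps_per_replica: Optional[float],
) -> int:
    """Hypothetical standalone version of the QPS-based replica calculation."""
    # Without a QPS target there is no signal to scale on, so the result
    # falls back to min_replicas (mirrors the early return quoted above).
    if target_qps_per_replica is None:
        return min_replicas
    # Assumed behavior: enough replicas to serve the observed load,
    # clipped to the [min_replicas, max_replicas] range.
    wanted = math.ceil(current_qps / target_qps_per_replica)
    return max(min_replicas, min(max_replicas, wanted))


# Repro config (min_replicas=0, max_replicas=1) with no QPS target:
# traffic is ignored and the target is min_replicas.
print(target_num_replicas(5.0, 0, 1, None))  # 0
# Same bounds with a QPS target set: traffic keeps one replica up.
print(target_num_replicas(5.0, 0, 1, 1.0))   # 1
```

Under these assumptions, the repro's min_replicas: 0 with no target_qps_per_replica yields a target of 0 regardless of traffic, which would explain both the shutdown and the ongoing requests not waking the service.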
