[Serve] sky serve up doesn't fetch existing clusters #3122
Comments
Also, the command output suggests specifying an existing cluster, but I couldn't find a valid option for doing so. Does anyone know how to fix this?
Thanks for the report @mrPsycox.
@cblmemo: Can we create an issue to fix the UX problem of not displaying spot/serve controller in the confirmation prompts of |
Thanks for the help. I solved the issue using |
Good point! Just filed issue #3138.
Running this command:
sky serve up skypilot-dev.yaml
With this yaml file:
sky behaves unexpectedly. After correctly selecting the cloud (as you can see below, my on-prem Kubernetes cluster), sky serve goes on to launch a new instance on AWS, without asking or telling me.
Here below the command result:
Service from YAML spec: skypilot-dev.yaml
Service Spec:
Readiness probe method: GET /v1/models
Readiness initial delay seconds: 1200
Replica autoscaling policy: Fixed 1 replica
Each replica will use the following resources (estimated):
I 02-08 13:59:49 optimizer.py:694] == Optimizer ==
I 02-08 13:59:49 optimizer.py:717] Estimated cost: $0.0 / hour
I 02-08 13:59:49 optimizer.py:717]
I 02-08 13:59:49 optimizer.py:840] Considered resources (1 node):
I 02-08 13:59:49 optimizer.py:910] ----------------------------------------------------------------------------------------------------
I 02-08 13:59:49 optimizer.py:910]  CLOUD        INSTANCE           vCPUs   Mem(GB)   ACCELERATORS   REGION/ZONE   COST ($)   CHOSEN
I 02-08 13:59:49 optimizer.py:910] ----------------------------------------------------------------------------------------------------
I 02-08 13:59:49 optimizer.py:910]  Kubernetes   4CPU--8GB--4V100   4       8         V100:4         kubernetes    0.00          ✔
I 02-08 13:59:49 optimizer.py:910] ----------------------------------------------------------------------------------------------------
I 02-08 13:59:49 optimizer.py:910]
Launching a new service 'sky-service-12e9'. Proceed? [Y/n]: Y
Launching controller for 'sky-service-12e9'...
W 02-08 13:59:55 instance.py:641] Expected security group sky-sg-sky-serve-controller-fcd54c3d-fcd5 not found.
W 02-08 13:59:55 instance.py:764] Find security group failed. Skip cleanup security group.
I 02-08 13:59:55 cloud_vm_ray_backend.py:4370] The cluster 'sky-serve-controller-fcd54c3d' (status: INIT) was not found on the cloud: it may be autodowned, manually terminated, or its launch never succeeded. Provisioning a new cluster by using the same resources as its original launch.
I 02-08 13:59:56 cloud_vm_ray_backend.py:4389] Creating a new cluster: 'sky-serve-controller-fcd54c3d' [1x AWS(m6i.xlarge, disk_size=200, ports=['30001-30100'])].
I 02-08 13:59:56 cloud_vm_ray_backend.py:4389] Tip: to reuse an existing cluster, specify --cluster (-c). Run `sky status` to see existing clusters.
I 02-08 13:59:56 cloud_vm_ray_backend.py:1386] To view detailed progress: tail -n100 -f /Users/mrpsycox/sky_logs/sky-2024-02-08-13-59-53-215853/provision.log
I 02-08 13:59:57 provisioner.py:79] Launching on AWS us-east-1 (us-east-1a,us-east-1b,us-east-1c,us-east-1d,us-east-1f)
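For anyone who wants to spot this mismatch quickly, here is a minimal Python sketch that pulls the two relevant facts out of the output above: the cloud the optimizer chose for the replica, and the cloud on which the controller cluster is being created. The function names and the regular expression are my own assumptions based only on the two log lines quoted here. This is not part of SkyPilot, and the log format may differ between versions.

```python
import re
from typing import Optional

# Two lines copied from the command output above; nothing else is assumed.
LOG = """\
I 02-08 13:59:49 optimizer.py:910] Kubernetes 4CPU--8GB--4V100 4 8 V100:4 kubernetes 0.00 ✔
I 02-08 13:59:56 cloud_vm_ray_backend.py:4389] Creating a new cluster: 'sky-serve-controller-fcd54c3d' [1x AWS(m6i.xlarge, disk_size=200, ports=['30001-30100'])].
"""


def chosen_replica_cloud(log: str) -> Optional[str]:
    """Return the CLOUD column of the optimizer row marked with a check mark."""
    for line in log.splitlines():
        if "optimizer.py" in line and line.rstrip().endswith("✔"):
            # Everything after the "] " log prefix is the table row itself.
            row = line.split("] ", 1)[1]
            return row.split()[0]
    return None


def new_cluster_cloud(log: str) -> Optional[str]:
    """Return the cloud named in the 'Creating a new cluster' line, if any."""
    match = re.search(r"Creating a new cluster: '[^']+' \[\d+x (\w+)\(", log)
    return match.group(1) if match else None


if __name__ == "__main__":
    replica = chosen_replica_cloud(LOG)
    controller = new_cluster_cloud(LOG)
    if replica and controller and replica.lower() != controller.lower():
        print(f"Replica cloud: {replica}; controller cluster cloud: {controller}")
```

On the output above, the replica is placed on Kubernetes while the controller cluster is created on AWS, which is the behavior reported in this issue.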