Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AWS cluster name restrictions should've been relaxed #2709

Closed
concretevitamin opened this issue Oct 15, 2023 · 5 comments · Fixed by #3130
Closed

AWS cluster name restrictions should've been relaxed #2709

concretevitamin opened this issue Oct 15, 2023 · 5 comments · Fixed by #3130

Comments

@concretevitamin
Copy link
Member

concretevitamin commented Oct 15, 2023

» sky launch --cloud aws --down -c ABC            
..

sky.exceptions.InvalidClusterNameError: Cluster name "ABC" is invalid; ensure it is fully matched by regex (e.g., only contains lower letters, numbers and dash): [a-z]([-a-z0-9]*[a-z0-9])?

IIRC, we have recently added a local cluster name -> cluster name on cloud indirection, so the local cluster names should not be restricted to that regex anymore?

UPDATE: solution's more involved than just relying on the current "truncate + append hash" mapping. See

#2743 (comment)

@dtran24
Copy link
Contributor

dtran24 commented Oct 29, 2023

Opened a PR here #2743

@concretevitamin
Copy link
Member Author

concretevitamin commented Nov 3, 2023

Brainstorming out loud some considerations in this problem:

(1) We want to support some common cluster names that aren't currently supported like:

  • uppercase letters: "LoRA", "Bert"
  • underscores within the name: "seed_1"
  • periods within the name: "cuda11.8"

(2) Another desired property is whatever mapping we do to produce cluster_name_on_cloud, it should try to show the original name as much as possible, so that users can easily spot the VMs belong to a cluster in the cloud consoles.

In other words the mapping shouldn't be completely random like LoRA -> <some hash> but rather maybe LoRA -> lora, seed_1 -> seed1 or something like that.

@dtran24
Copy link
Contributor

dtran24 commented Nov 5, 2023

To fulfill these requirements, we can:

  1. Lowercase all letters
  2. Remove all characters that are not letters, numbers or hyphens

Seems like the cluster name regex also wants to enforce that the name starts with a letter. If this condition is not satisfied by the user with the local name (e.g. "2bert"), we can prepend a letter to make the cloud cluster name valid

@concretevitamin
Copy link
Member Author

Sounds like a solid start! I added another case to the above (periods). Have you thought about the following?

  • After our processing, we feed the processed name to the generic cluster name regex and/or cloud-specific checks as additional safeguards?
  • Let's say -c LoRA, -c lora, and -c lo.ra both produce an internal name lora. Should we treat these three CLI arguments as referring to the same cluster or different clusters?

@dtran24
Copy link
Contributor

dtran24 commented Nov 7, 2023

Nice!

  • If the code is already there, think it makes sense to keep it there as a start and remove it once the change in this PR has been successfully used in production for a bit. Ensures backward compatibility, keeps the changes smaller, and limits the blast radius
  • I believe they should be treated differently. If -c LoRa, -c lora, and -c lo.ra were all used, I would imagine these would all be different users since the naming conventions are different. Different users would want to use a different cluster if the naming is not the same. If we were to go down this path, one approach to differentiating the cluster names could be to hash the characters not in the cleaned CLI argument (e.g. for -c LoRA, hash LRA. For -c lo.ra, hash .)

What are your thoughts?

I'll look into updating my previous PR with these considerations soon

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
2 participants