[k8s] GPU Feature discovery label formatter #3493
This is awesome @asaiacai! It looks very reasonable to me. @romilbhardwaj for another look to make sure it does not break our other formatters : )
Co-authored-by: Zhanghao Wu <[email protected]>
Thanks @asaiacai!
Thanks @asaiacai. Tested on A100 and H100 from Lambda. Left a comment about documenting that this labelformatter cannot be used with autoscaling, otherwise lgtm!
just added the docstring @romilbhardwaj. Thanks for the review! lmk if this needs anything else.
Thanks @asaiacai!
* GFDLabel formatter for k8s
* update comment
* format
* substring match against k8s labels instead of strict matching
* cleanup
* use k8s label
* map k8s label value to accelerator instead of accelerator to label value
* remove unused get_gke_accelerator_name
* remove get acc from value func
* pattern match against A100'
* pattern match against A100'
* format
* fix typo
* format
* re.search
* compare strings
* add P4000
* format
* lower case for check

Co-authored-by: Zhanghao Wu <[email protected]>

* force upper case
* match skypilot labeler logic
* format.sh
* add docstring
* fix class docstring
* grammar fix
* format

---------

Co-authored-by: Zhanghao Wu <[email protected]>
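To make the approach named in the commit list concrete (map a k8s label value to an accelerator via substring matching, force upper case, use re.search), here is a minimal Python sketch. It is not the code merged in this PR: the function name accelerator_from_gfd_label, the CANONICAL_GPU_NAMES list, and the sample label values are all assumptions chosen for illustration.

```python
import re
from typing import Optional

# Hypothetical, illustrative list of accelerator names (not taken from the PR).
# Longer names come first so 'A100-80GB' is tried before 'A100'.
CANONICAL_GPU_NAMES = [
    'A100-80GB', 'A100', 'A10G', 'A10', 'H100',
    'P4000', 'P40', 'P4', 'T4', 'V100', 'L4',
]


def accelerator_from_gfd_label(value: str) -> Optional[str]:
    """Map an nvidia.com/gpu.product label value to an accelerator name.

    Hypothetical helper: returns None when no known name is found.
    """
    upper = value.upper()  # compare in upper case so 'a100' also matches
    for name in CANONICAL_GPU_NAMES:
        # Substring match, but reject hits embedded in a longer
        # alphanumeric token (e.g. 'P40' inside 'P4000').
        pattern = rf'(?<![A-Z0-9]){re.escape(name)}(?![A-Z0-9])'
        if re.search(pattern, upper):
            return name
    return None


if __name__ == '__main__':
    # Sample values are assumed examples of what the label might contain.
    for label in ['NVIDIA-A100-SXM4-40GB', 'Tesla-T4', 'Quadro-P4000']:
        print(label, '->', accelerator_from_gfd_label(label))
```

Note that the sketch only goes from label value to accelerator name; it does not attempt the reverse mapping from an accelerator name to a single label value.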
Resolves #2460
This allows k8s to consume the node label nvidia.com/gpu.product created by GPU feature discovery, which is commonly deployed through the NVIDIA GPU operator.

Tested (run the relevant ones):
bash format.sh
eks_test_cluster.yaml
k3s with gpu-operator, using deploy_k3s.sh modified to exclude the skypilot k8s labeler, ensure the following can run