[k8s] Robust service account and namespace support #3632
Thanks for fixing this @romilbhardwaj! Left several comments : )
After creating the service account, the cluster admin may distribute kubeconfigs with the ``sky-sa`` service account to users who need to access the cluster.

Users should also configure SkyPilot to use the ``sky-sa`` service account through ``~/.sky/config.yaml``:
Can we change the `sky-sa` name above to match the default name in our codebase, so that after creation the user does not need to specify it in `~/.sky/config.yaml`? We can still mention that a user can change the name of the service account and specify it in the config YAML, but we should keep the service account creation with the same name as the hardcoded one.
Using the default service account `skypilot-service-account` requires additional permissions because our code is self-correcting: it automatically fixes any service account misconfigurations by creating/patching resources (e.g., when a user is upgrading versions or has accidentally deleted k8s RBAC resources created by SkyPilot).
- Our code will inspect and create/update the necessary roles and rolebindings. This requires additional `"create", "patch"` permissions on `"clusterroles", "clusterrolebindings"`.
- Our code will inspect and create the `skypilot-system` namespace if it doesn't exist. This requires additional `"list", "create"` permissions on `namespaces`.

These extra permissions on `"clusterroles", "clusterrolebindings", "namespaces"` may be considered too permissive in some environments (e.g., shared clusters). Perhaps we should keep the permissions here limited to the minimal permissions required?
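
To see whether a given service account actually holds the extra permissions listed above, a cluster admin could probe them with `kubectl auth can-i`. The following is a minimal sketch, assuming `kubectl` is configured and the caller is allowed to impersonate service accounts. The function names `sa_subject` and `check_extra_permissions` are hypothetical and not part of SkyPilot.

```shell
# Hypothetical helpers for illustration; not part of SkyPilot.

# Build the user name Kubernetes RBAC assigns to a service account.
# Args: <namespace> <service-account-name>
sa_subject() {
  printf 'system:serviceaccount:%s:%s' "$1" "$2"
}

# Print yes/no for each extra permission listed in the comment above.
# Args: <namespace> <service-account-name>
check_extra_permissions() {
  subject="$(sa_subject "$1" "$2")"
  for resource in clusterroles clusterrolebindings; do
    for verb in create patch; do
      printf '%s %s: ' "$verb" "$resource"
      # can-i exits non-zero on "no"; keep going so every check is reported.
      kubectl auth can-i "$verb" "$resource" --as="$subject" || true
    done
  done
  for verb in list create; do
    printf '%s namespaces: ' "$verb"
    kubectl auth can-i "$verb" namespaces --as="$subject" || true
  done
}
```

For example, `check_extra_permissions default skypilot-service-account` would report on the default service account, assuming it lives in the `default` namespace. A service account created with minimal permissions would be expected to answer `no` to each of these checks.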
@@ -18,7 +18,7 @@ SkyPilot's Kubernetes support is designed to work with most Kubernetes distribut
 To connect to a Kubernetes cluster, SkyPilot needs:

 * An existing Kubernetes cluster running Kubernetes v1.20 or later.
-* A `Kubeconfig <kubeconfig>`_ file containing access credentials and namespace to be used.
+* A `Kubeconfig <kubeconfig>`_ file containing access credentials and namespace to be used. Refer to :ref:`required permissions <cloud-permissions-kubernetes>` guide for details.
How about:
-* A `Kubeconfig <kubeconfig>`_ file containing access credentials and namespace to be used. Refer to :ref:`required permissions <cloud-permissions-kubernetes>` guide for details.
+* A `Kubeconfig <kubeconfig>`_ file containing access credentials and namespace to be used. To reduce the permissions for a user, check :ref:`required permissions <cloud-permissions-kubernetes>` for details.
Thanks! Updated. I'm also reworking this page in a separate PR.
Thanks @Michaelvll. Addressed comments and updated
Thanks for fixing those permissions @romilbhardwaj! This is very important for our k8s users. : )
# * Specify SKYPILOT_NAMESPACE env var to override the default namespace
# * Specify SKYPILOT_SA_NAME env var to override the default service account name
# * Specify SKIP_SA_CREATION=1 to skip creating the service account and use an existing one
#
# Usage:
# # Create "sky-sa" service account with minimal permissions in "default" namespace and generate kubeconfig
# $ ./generate_static_kubeconfig.sh
#
# # Create "my-sa" account with minimal permissions in "my-namespace" namespace and generate kubeconfig
# $ SKYPILOT_SA_NAME=my-sa SKYPILOT_NAMESPACE=my-namespace ./generate_static_kubeconfig.sh
#
# # Use an existing service account "my-sa" in "my-namespace" namespace and generate kubeconfig
# $ SKIP_SA_CREATION=1 SKYPILOT_SA_NAME=my-sa SKYPILOT_NAMESPACE=my-namespace ./generate_static_kubeconfig.sh
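
The override behavior described in these header comments amounts to ordinary shell parameter defaults. Below is a minimal sketch of that pattern. It is not the actual contents of `generate_static_kubeconfig.sh`; `resolve_settings` is a hypothetical name, and the defaults `sky-sa` and `default` are taken from the usage text above.

```shell
# Hypothetical sketch; not the actual generate_static_kubeconfig.sh.
# Prints "<action> <service-account> <namespace>" based on the env vars.
resolve_settings() {
  # Fall back to the documented defaults when the env vars are unset or empty.
  namespace="${SKYPILOT_NAMESPACE:-default}"
  sa_name="${SKYPILOT_SA_NAME:-sky-sa}"
  # SKIP_SA_CREATION=1 means an existing service account is reused.
  if [ "${SKIP_SA_CREATION:-0}" = "1" ]; then
    action="reuse"
  else
    action="create"
  fi
  echo "${action} ${sa_name} ${namespace}"
}
```

With no variables set this prints `create sky-sa default`, matching the first usage example. Setting all three variables as in the last usage example selects the reuse path for `my-sa` in `my-namespace`.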
Should we include the usage of this script in the doc as well, to make it easier for the user to create the service account?
The doc for minimal permissions above could serve as a deep dive into what this shell script does.
Absolutely - I'm working on it in a separate PR for revamping k8s cluster admin docs :)
* WIP
* Working permissions
* lint
* comments and update generate_static_kubeconfig.sh
This PR has a few updates for k8s SA and namespace support:

Code:
- `skypilot-system` namespace creation to happen only if the default `SERVICE_ACCOUNT` remote_identity is used. This is to reduce the scope of permissions required if a SA already exists.
- `default` namespace so this bug was not caught.

Docs:

Ran the following tests for two scenarios on a GKE cluster:
- `default` namespace. This is a base case to verify existing functionality does not break.
- `myns` namespace following the create-sky-sa.yaml example in the docs, creates a kubeconfig that uses this namespace and service account to authenticate, then uses this kubeconfig to run SkyPilot in the `myns` namespace with `remote_identity: sky-sa`.
- `pytest -v tests/test_smoke.py::test_managed_jobs_storage --kubernetes`
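
For the second scenario, the `remote_identity: sky-sa` setting has to end up in the SkyPilot config file. A minimal sketch of writing it from the shell follows. `write_sky_config` is a hypothetical helper, and nesting the key under `kubernetes:` is an assumption about the config layout that should be checked against the SkyPilot config reference.

```shell
# Hypothetical helper for illustration; not part of SkyPilot.
# Assumption: remote_identity is nested under a top-level `kubernetes:` key.
# Args: <config-path> <service-account-name>
# Note: this overwrites any existing file at <config-path>.
write_sky_config() {
  config_path="$1"
  sa_name="$2"
  mkdir -p "$(dirname "$config_path")"
  cat > "$config_path" <<EOF
kubernetes:
  remote_identity: ${sa_name}
EOF
}
```

Calling `write_sky_config "$HOME/.sky/config.yaml" sky-sa` would produce the config used in that scenario. Because the helper overwrites the target file, an existing config should be merged by hand instead.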