[k8s] SkyServe on Kubernetes (#3377)
* playing around
* wip with hacks
* wip refactor get_endpoints
* working get_endpoints
* wip
* fixed circular import
* Working for ingress and loadbalancer svc
* lint
* add purging from #3094
* Use local catalog on the controller too
* use externalip if available
* add dshm_size_limit
* optimize dependency installation
* Add todo
* optimize ingress
* fix
* fix
* remove autostop timing
* Fix URLs for raw IP:ports
* fixes
* wip
* SA wip
* Allow use of service accounts through remote_identity field
* Make purge work for no clusters in kubeconfig
* Handle ingress namespace not present
* setup optimizations and critical SA key fix
* fix docs
* fix docs
* Add support for skypilot.co/external-ip annotation for ingress
* Remove dshm_size_limit
* Undo kind changes
* Update service account docs
* minor docs
* update comment
* is_same_cloud to cloud_in_list
* refactor query_ports to use head_ip
* autodown + http prefixing in callers
* fix ssh key issues when user hash is reused
* linting
* lint
* lint, HOST_CONTROLLERS
* add serve smoke tests for k8s
* disallow file_mounts and workdir if no storage cloud is enabled
* minor
* lint
* update fastchat to use --host 127.0.0.1
* extend timeout
* docs comments
* rename to port
* add to core.py
* docstrs
* add docs on exec based auth
* expand elif
* add lb comment
* refactor
* refactor
* fix docs build
* add PODIP mode support
* make ssh services optional
* nits
* Revert "make ssh services optional" This reverts commit 87d4d25.
* Revert "add PODIP mode support" This reverts commit 750d4d4.
* nits
* use 0.0.0.0 when on k8s; use common impl for other clouds
* return dict instead of raising errors in core.endpoints()
* lint
* merge fixes
* merge fixes
* merge fixes
* lint
* fix smoke tests
* fix smoke tests
* comment
* add enum for remote identity
* lint
* add skip_status_check
* remove zone requirement
* fix timings for test
* silence curl download
* move jq from yaml to test_minimal
* move jq from yaml to test_minimal
1 parent d09827b commit 0a03995
Showing 47 changed files with 1,227 additions and 241 deletions.
@@ -20,3 +20,4 @@ Table of Contents
aws
gcp
vsphere
kubernetes
docs/source/cloud-setup/cloud-permissions/kubernetes.rst
234 changes: 234 additions & 0 deletions
@@ -0,0 +1,234 @@
.. _cloud-permissions-kubernetes:

Kubernetes
==========

When running outside your Kubernetes cluster, SkyPilot uses your local ``~/.kube/config`` file
for authentication and creating resources on your Kubernetes cluster.
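
Because the local kubeconfig determines which cluster is targeted, it can help to
confirm what it currently resolves to before launching anything. The following is a
minimal sketch using standard ``kubectl`` commands; the function name
``show_kube_target`` is hypothetical, and it assumes the active context is the one
you intend SkyPilot to use.

.. code-block:: bash

  # Hypothetical helper: print the active context and its API server URL.
  show_kube_target() {
    # Name of the context that kubectl resolves from the local kubeconfig.
    kubectl config current-context
    # --minify restricts the output to the entries used by the current context.
    kubectl config view --minify -o jsonpath='{.clusters[0].cluster.server}{"\n"}'
  }

Calling ``show_kube_target`` prints the context name followed by the API server address.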

When running inside your Kubernetes cluster (e.g., as a Spot controller or Serve controller),
SkyPilot can operate using any of the following three authentication methods:

1. **Using your local kubeconfig file**: In this case, SkyPilot will
   copy your local ``~/.kube/config`` file to the controller pod and use it for
   authentication. This is the default method when running inside the cluster,
   and no additional configuration is required.

   .. note::

     If your cluster uses exec-based authentication in your ``~/.kube/config`` file
     (e.g., GKE uses exec auth by default), SkyPilot may not be able to authenticate using this method. In this case,
     consider using the service account methods below.

2. **Creating a service account**: SkyPilot can automatically create the service
   account and roles for itself to manage resources in the Kubernetes cluster.
   To use this method, set ``remote_identity: SERVICE_ACCOUNT`` in the
   ``kubernetes`` section of your :ref:`~/.sky/config.yaml <config-yaml>` file:

   .. code-block:: yaml

     kubernetes:
       remote_identity: SERVICE_ACCOUNT

   For details on the permissions that are granted to the service account,
   refer to the `Permissions required for SkyPilot`_ section below.

3. **Using a custom service account**: If you have a custom service account
   with the `necessary permissions <k8s-permissions_>`__, you can configure
   SkyPilot to use it by adding this to your :ref:`~/.sky/config.yaml <config-yaml>` file:

   .. code-block:: yaml

     kubernetes:
       remote_identity: your-service-account-name

.. note::

  Service account based authentication applies only when the remote SkyPilot
  cluster (including the spot and serve controllers) is launched inside the
  Kubernetes cluster. When running outside the cluster (e.g., on AWS),
  SkyPilot will use the local ``~/.kube/config`` file for authentication.
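
To decide whether the default kubeconfig method is likely to hit the exec-based
authentication caveat noted under the first method, you can look for an ``exec:``
stanza in your kubeconfig. The sketch below is a rough text heuristic rather than a
YAML parser, and the function name ``kubeconfig_uses_exec_auth`` is hypothetical.

.. code-block:: bash

  # Hypothetical helper: succeed if the kubeconfig contains an "exec:" stanza.
  # $1 is the kubeconfig path; it defaults to ~/.kube/config.
  kubeconfig_uses_exec_auth() {
    local cfg="${1:-$HOME/.kube/config}"
    # Matches a line consisting only of an indented "exec:" key (block-style YAML).
    grep -Eq '^[[:space:]]*exec:[[:space:]]*$' "$cfg"
  }

If ``kubeconfig_uses_exec_auth`` succeeds, one of the service account methods may be the safer choice.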

Below are the permissions required by SkyPilot and an example service account YAML that you can use to create a service account with the necessary permissions.

.. _k8s-permissions:

Permissions required for SkyPilot
---------------------------------

SkyPilot requires permissions equivalent to the following roles to be able to manage the resources in the Kubernetes cluster:

.. code-block:: yaml

  # Namespaced role for the service account
  # Required for creating pods, services and other necessary resources in the namespace.
  # Note these permissions only apply in the namespace where SkyPilot is deployed.
  kind: Role
  apiVersion: rbac.authorization.k8s.io/v1
  metadata:
    name: sky-sa-role
    namespace: default
  rules:
    - apiGroups: ["*"]
      resources: ["*"]
      verbs: ["*"]
  ---
  # ClusterRole for accessing cluster-wide resources. Details for each resource below:
  kind: ClusterRole
  apiVersion: rbac.authorization.k8s.io/v1
  metadata:
    name: sky-sa-cluster-role
    namespace: default
    labels:
      parent: skypilot
  rules:
    - apiGroups: [""]
      resources: ["nodes"]  # Required for getting node resources.
      verbs: ["get", "list", "watch"]
    - apiGroups: ["rbac.authorization.k8s.io"]
      resources: ["clusterroles", "clusterrolebindings"]  # Required for launching more SkyPilot clusters from within the pod.
      verbs: ["get", "list", "watch"]
    - apiGroups: ["node.k8s.io"]
      resources: ["runtimeclasses"]  # Required for autodetecting the runtime class of the nodes.
      verbs: ["get", "list", "watch"]
  ---
  # Optional: If using ingresses, role for accessing ingress service IP
  apiVersion: rbac.authorization.k8s.io/v1
  kind: Role
  metadata:
    namespace: ingress-nginx
    name: sky-sa-role-ingress-nginx
  rules:
    - apiGroups: [""]
      resources: ["services"]
      verbs: ["list", "get"]

These roles must apply to both the user account configured in the kubeconfig file and the service account used by SkyPilot (if configured).
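
One way to spot-check these permissions for the user in your kubeconfig is
``kubectl auth can-i``, which exits non-zero when the answer is "no". The sketch below
samples a few of the rules listed above rather than covering all of them; the function
name ``check_sky_permissions`` is hypothetical, and the ``default`` namespace is assumed
to match the roles shown.

.. code-block:: bash

  # Hypothetical helper: spot-check a subset of the permissions listed above.
  # $1 is the namespace SkyPilot runs in (assumed to be "default").
  check_sky_permissions() {
    local ns="${1:-default}" failed=0
    # Namespaced Role: full access inside the SkyPilot namespace.
    kubectl auth can-i '*' '*' --namespace "$ns" || failed=1
    # ClusterRole: read access to cluster-wide resources.
    kubectl auth can-i list nodes || failed=1
    kubectl auth can-i list runtimeclasses.node.k8s.io || failed=1
    kubectl auth can-i list clusterroles.rbac.authorization.k8s.io || failed=1
    # Non-zero if any of the checks above was denied.
    return "$failed"
  }

Each check prints ``yes`` or ``no``; the function returns non-zero if any of them is denied.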

.. _k8s-sa-example:

Example using Custom Service Account
------------------------------------

To create a service account that has the necessary permissions for SkyPilot, you can use the following YAML:

.. code-block:: yaml

  # create-sky-sa.yaml
  kind: ServiceAccount
  apiVersion: v1
  metadata:
    name: sky-sa
    namespace: default
    labels:
      parent: skypilot
  ---
  # Role for the service account
  kind: Role
  apiVersion: rbac.authorization.k8s.io/v1
  metadata:
    name: sky-sa-role
    namespace: default
    labels:
      parent: skypilot
  rules:
    - apiGroups: ["*"]  # Required for creating pods, services, secrets and other necessary resources in the namespace.
      resources: ["*"]
      verbs: ["*"]
  ---
  # RoleBinding for the service account
  kind: RoleBinding
  apiVersion: rbac.authorization.k8s.io/v1
  metadata:
    name: sky-sa-rb
    namespace: default
    labels:
      parent: skypilot
  subjects:
    - kind: ServiceAccount
      name: sky-sa
  roleRef:
    kind: Role
    name: sky-sa-role
    apiGroup: rbac.authorization.k8s.io
  ---
  # Role for accessing ingress resources
  apiVersion: rbac.authorization.k8s.io/v1
  kind: Role
  metadata:
    namespace: ingress-nginx
    name: sky-sa-role-ingress-nginx
  rules:
    - apiGroups: [""]
      resources: ["services"]
      verbs: ["list", "get", "watch"]
    - apiGroups: ["rbac.authorization.k8s.io"]
      resources: ["roles", "rolebindings"]
      verbs: ["list", "get", "watch"]
  ---
  # RoleBinding for accessing ingress resources
  apiVersion: rbac.authorization.k8s.io/v1
  kind: RoleBinding
  metadata:
    name: sky-sa-rolebinding-ingress-nginx
    namespace: ingress-nginx
  subjects:
    - kind: ServiceAccount
      name: sky-sa
      namespace: default
  roleRef:
    kind: Role
    name: sky-sa-role-ingress-nginx
    apiGroup: rbac.authorization.k8s.io
  ---
  # ClusterRole for the service account
  kind: ClusterRole
  apiVersion: rbac.authorization.k8s.io/v1
  metadata:
    name: sky-sa-cluster-role
    namespace: default
    labels:
      parent: skypilot
  rules:
    - apiGroups: [""]
      resources: ["nodes"]  # Required for getting node resources.
      verbs: ["get", "list", "watch"]
    - apiGroups: ["rbac.authorization.k8s.io"]
      resources: ["clusterroles", "clusterrolebindings"]  # Required for launching more SkyPilot clusters from within the pod.
      verbs: ["get", "list", "watch"]
    - apiGroups: ["node.k8s.io"]
      resources: ["runtimeclasses"]  # Required for autodetecting the runtime class of the nodes.
      verbs: ["get", "list", "watch"]
    - apiGroups: ["networking.k8s.io"]  # Required for exposing services.
      resources: ["ingressclasses"]
      verbs: ["get", "list", "watch"]
  ---
  # ClusterRoleBinding for the service account
  apiVersion: rbac.authorization.k8s.io/v1
  kind: ClusterRoleBinding
  metadata:
    name: sky-sa-cluster-role-binding
    namespace: default
    labels:
      parent: skypilot
  subjects:
    - kind: ServiceAccount
      name: sky-sa
      namespace: default
  roleRef:
    kind: ClusterRole
    name: sky-sa-cluster-role
    apiGroup: rbac.authorization.k8s.io

Create the service account using the following command:

.. code-block:: bash

  $ kubectl apply -f create-sky-sa.yaml
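
Once the manifest has been applied, a quick sanity check is to confirm that the service
account exists and that the bindings took effect. The sketch below uses impersonation via
the global ``--as`` flag, which assumes your own user is allowed to impersonate service
accounts; the function name ``verify_sky_sa`` is hypothetical, and the defaults mirror the
``sky-sa`` account and ``default`` namespace from the YAML above.

.. code-block:: bash

  # Hypothetical helper: check that the service account exists and can create pods.
  # $1 is the service account name, $2 its namespace.
  verify_sky_sa() {
    local sa="${1:-sky-sa}" ns="${2:-default}"
    # Fails if the service account was not created.
    kubectl get serviceaccount "$sa" --namespace "$ns" || return 1
    # Ask the API server whether the service account itself may create pods.
    kubectl auth can-i create pods --namespace "$ns" \
      --as "system:serviceaccount:${ns}:${sa}"
  }

For the example above, the call would be ``verify_sky_sa sky-sa default``.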

After creating the service account, configure SkyPilot to use it through ``~/.sky/config.yaml``:

.. code-block:: yaml

  kubernetes:
    remote_identity: sky-sa   # Or your service account name