From 05f105a85ebde2f154a53d39dc5be84310c6f409 Mon Sep 17 00:00:00 2001
From: Fabio Buso
Date: Tue, 17 Dec 2024 11:29:38 +0100
Subject: [PATCH 1/2] Updates to the GCP K8s installation documentation
---
 .../setup_installation/aws/getting_started.md | 11 ++-
 .../setup_installation/gcp/getting_started.md | 70 +++++++++----------
 2 files changed, 38 insertions(+), 43 deletions(-)
diff --git a/docs/setup_installation/aws/getting_started.md b/docs/setup_installation/aws/getting_started.md
index e1626bff8..7ccb5c0a3 100644
--- a/docs/setup_installation/aws/getting_started.md
+++ b/docs/setup_installation/aws/getting_started.md
@@ -8,21 +8,20 @@ SageMaker and KubeFlow. This guide shows how to set up the Hopsworks platform in
 To follow the instruction on this page you will need the following:
-- Kubernetes Version: Hopsworks can be deployed on AKS clusters running Kubernetes >= 1.27.0.
+- Kubernetes Version: Hopsworks can be deployed on EKS clusters running Kubernetes >= 1.27.0.
 - [aws-cli](https://aws.amazon.com/cli/) to provision the AWS resources
 - [eksctl](https://eksctl.io/) to interact with the AWS APIs and provision the EKS cluster
 - [helm](https://helm.sh/) to deploy Hopsworks
-## ECR Registry
+### ECR Registry
 Hopsworks allows users to customize the images used by Python jobs, Jupyter Notebooks and (Py)Spark applications running in their projects. The images are stored in ECR. Hopsworks needs access to an ECR repository to push the project images.
-## Permissions
+### Permissions
-By default, the deployment requires cluster admin level access to be able to create a set of ClusterRoles, ServiceAccounts and ClusterRoleBindings. If you don’t have cluster admin level access, you can ask your administrator to provision the necessary ClusterRoles, ServiceAccounts and ClusterRoleBindings as described in the section below.
-
-A namespace is required to deploy the Hopsworks stack. If you don’t have permissions to create a namespace you should ask your K8s administrator to provision one for you.
+- The deployment requires cluster admin access to create ClusterRoles, ServiceAccounts, and ClusterRoleBindings.
+- A namespace is required to deploy the Hopsworks stack. If you don’t have permissions to create a namespace, ask your GKE administrator to provision one.
 ## EKS Deployment
diff --git a/docs/setup_installation/gcp/getting_started.md b/docs/setup_installation/gcp/getting_started.md
index 73a943d35..8f6ebfc11 100644
--- a/docs/setup_installation/gcp/getting_started.md
+++ b/docs/setup_installation/gcp/getting_started.md
@@ -9,15 +9,11 @@ SageMaker and KubeFlow. This guide shows how to set up the Hopsworks platform in
 To follow the instruction on this page you will need the following:
-- Kubernetes Version: Hopsworks can be deployed on AKS clusters running Kubernetes >= 1.27.0.
-- The [gcloud CLI](https://cloud.google.com/sdk/gcloud)
-- The [gsutil tool](https://cloud.google.com/storage/docs/gsutil)
-- kubectl (to manage the AKS cluster)
-- helm (to deploy Hopsworks)
+- Kubernetes Version: Hopsworks can be deployed on GKE clusters running Kubernetes >= 1.27.0.
+- [gcloud CLI](https://cloud.google.com/sdk/gcloud) to provision the GCP resources
+- [gke-gcloud-auth-plugin](https://cloud.google.com/blog/products/containers-kubernetes/kubectl-auth-changes-in-gke) to manage authentication with the GKE cluster
+- [helm](https://helm.sh/) to deploy Hopsworks
-## GCR Registry
-
-Hopsworks allows users to customize images for Python jobs, Jupyter Notebooks, and (Py)Spark applications. These images should be stored in Google Container Registry (GCR). The GKE cluster needs access to a GCR repository to push project images.
 ### Permissions
@@ -38,7 +34,7 @@ gsutil mb -l $region gs://$bucket_name
 ### Step 1.2: Create Service Account
-Create a file named hopsworksai_role.yaml with the following content:
+Create a file named `hopsworksai_role.yaml` with the following content:
 ```bash
 title: Hopsworks AI Instances
@@ -67,56 +63,54 @@ includedPermissions:
 Execute the following gcloud command to create a custom role from the file. Replace $PROJECT_ID with your GCP project id:
 ```bash
-gcloud iam roles create hopsworksai_instances --project=$PROJECT_ID --file=hopsworksai_role.yaml
+gcloud iam roles create hopsworksai_instances \
+ --project=$PROJECT_ID \
+ --file=hopsworksai_role.yaml
 ```
-Create a service account:
-
 Execute the following gcloud command to create a service account for Hopsworks AI instances. Replace $PROJECT_ID with your GCP project id:
 ```bash
-gcloud iam service-accounts create hopsworksai_instances --project=$PROJECT_ID --description="Service account for Hopsworks AI instances" --display-name="Hopsworks AI instances"
+gcloud iam service-accounts create hopsworksai_instances \
+ --project=$PROJECT_ID \
+ --description="Service account for Hopsworks AI instances" \
+ --display-name="Hopsworks AI instances"
 ```
 Execute the following gcloud command to bind the custom role to the service account. Replace all occurrences $PROJECT_ID with your GCP project id:
 ```bash
-gcloud projects add-iam-policy-binding $PROJECT_ID --member="serviceAccount:hopsworks-ai-instances@$PROJECT_ID.iam.gserviceaccount.com" --role="projects/$PROJECT_ID/roles/hopsworksai_instances"
+gcloud projects add-iam-policy-binding $PROJECT_ID \
+ --member="serviceAccount:hopsworks-ai-instances@$PROJECT_ID.iam.gserviceaccount.com" \
+ --role="projects/$PROJECT_ID/roles/hopsworksai_instances"
 ```
-
 ### Step 1.3: Create a GKE Cluster
 ```bash
-gcloud container clusters create --zone --machine-type n2-standard-8 --num-nodes 3 --enable-ip-alias --service-account my-service-account@my-project.iam.gserviceaccount.com
+gcloud container clusters create \
+ --zone \
+ --machine-type n2-standard-8 \
+ --num-nodes 1 \
+ --enable-ip-alias \
+ --service-account my-service-account@my-project.iam.gserviceaccount.com
 ```
-
-### Step 1.4: Create GCR repository
-
-Enable Artifact Registry and create a GCR repository to store images:
+Once the creation process is completed, you should be able to access the cluster using the kubectl CLI tool:
 ```bash
-gcloud artifacts repositories create --repository-format=docker --location=
+kubectl get nodes
 ```
-### Step 1.5: Link the GCS bucket and the GCR repository
-
-```bash
-gsutil iam ch serviceAccount:PROJECT_NUMBER-compute@developer.gserviceaccount.com:objectViewer gs://YOUR_BUCKET_NAME
-gsutil iam ch serviceAccount:PROJECT_NUMBER-compute@developer.gserviceaccount.com:objectAdmin gs://YOUR_BUCKET_NAME
-
-gsutil iam ch serviceAccount:YOUR_EMAIL_ADDRESS:objectViewer gs://YOUR_BUCKET_NAME
-gsutil iam ch serviceAccount:YOUR_EMAIL_ADDRESS:objectAdmin gs://YOUR_BUCKET_NAME
+### Step 1.4: Create GCR repository
-gcloud projects add-iam-policy-binding $PROJECT_ID --member="serviceAccount:SERVICE_ACCOUNT_EMAIL" --role="roles/storage.objectViewer"
-```
+Hopsworks allows users to customize images for Python jobs, Jupyter Notebooks, and (Py)Spark applications. These images should be stored in Google Container Registry (GCR). The GKE cluster needs access to a GCR repository to push project images.
-## Step 2: Configure kubectl
+Enable Artifact Registry and create a GCR repository to store images:
 ```bash
-gcloud auth configure-docker
-
-kubectl get pods
+gcloud artifacts repositories create \
+ --repository-format=docker \
+ --location=
 ```
 ## Step 3: Setup Hopsworks for Deployment
@@ -177,7 +171,10 @@ global:
 Deploy Hopsworks in the created namespace.
 ```bash
-helm install hopsworks hopsworks/hopsworks --namespace hopsworks --values values.gcp.yaml --timeout=600s
+helm install hopsworks hopsworks/hopsworks \
+ --namespace hopsworks \
+ --values values.gcp.yaml \
+ --timeout=600s
 ```
 Check that Hopsworks is installing on your provisioned AKS cluster.
@@ -194,7 +191,6 @@ Upon completion (circa 20 minutes), setup a load balancer to access Hopsworks:
 kubectl expose deployment hopsworks --type=LoadBalancer --name=hopsworks-service --namespace
 ```
-
 ## Step 5: Next steps
 Check out our other guides for how to get started with Hopsworks and the Feature Store:
From 4a922bc5f9a9afd88ff8235f0af00258a0edf59d Mon Sep 17 00:00:00 2001
From: Fabio Buso
Date: Tue, 17 Dec 2024 11:54:24 +0100
Subject: [PATCH 2/2] Fix GKE/EKS mixup
---
 docs/setup_installation/aws/getting_started.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/docs/setup_installation/aws/getting_started.md b/docs/setup_installation/aws/getting_started.md
index 7ccb5c0a3..5c6c538eb 100644
--- a/docs/setup_installation/aws/getting_started.md
+++ b/docs/setup_installation/aws/getting_started.md
@@ -21,7 +21,7 @@ Hopsworks allows users to customize the images used by Python jobs, Jupyter Note
 - The deployment requires cluster admin access to create ClusterRoles, ServiceAccounts, and ClusterRoleBindings.
-- A namespace is required to deploy the Hopsworks stack. If you don’t have permissions to create a namespace, ask your GKE administrator to provision one.
+- A namespace is required to deploy the Hopsworks stack. If you don’t have permissions to create a namespace, ask your EKS administrator to provision one.
 ## EKS Deployment
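
In the service-account steps of the GCP guide, the account is created as `hopsworksai_instances`, while the role binding refers to `hopsworks-ai-instances@$PROJECT_ID.iam.gserviceaccount.com`. A minimal sketch of deriving both from a single value might look like the following. The helper names `valid_sa_id` and `sa_member`, the `SA_ID` variable, and the `my-project` value are hypothetical. The ID rule encoded in the regex (6 to 30 characters, lowercase letters, digits and dashes, starting with a letter) is an assumption and should be checked against the current IAM documentation.

```bash
# Sketch only: keep the service-account ID in one place so the "create" and
# "add-iam-policy-binding" steps refer to the same account.

# Succeeds if the ID matches the assumed service-account ID rule:
# 6-30 characters, lowercase letters, digits and dashes, starting with a letter.
valid_sa_id() {
  printf '%s\n' "$1" | grep -Eq '^[a-z][a-z0-9-]{4,28}[a-z0-9]$'
}

# Prints the member string expected by the --member flag of
# `gcloud projects add-iam-policy-binding`.
# $1 = service-account ID, $2 = GCP project ID
sa_member() {
  printf 'serviceAccount:%s@%s.iam.gserviceaccount.com\n' "$1" "$2"
}

SA_ID="hopsworks-ai-instances"   # hypothetical value
PROJECT_ID="my-project"          # hypothetical value

if valid_sa_id "$SA_ID"; then
  echo "member: $(sa_member "$SA_ID" "$PROJECT_ID")"
else
  echo "invalid service account ID: $SA_ID" >&2
fi
```

With `SA_ID` validated once, the same variable could be passed both to `gcloud iam service-accounts create "$SA_ID"` and to the `--member` flag, so the two commands stay in sync.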