Updates to the GCP K8s installation documentation (#430)
* Updates to the GCP K8s installation documentation

* Fix GKE/EKS mixup
SirOibaf committed Dec 19, 2024
1 parent 191180c commit 3ea3f5d
Showing 2 changed files with 38 additions and 43 deletions.
11 changes: 5 additions & 6 deletions docs/setup_installation/aws/getting_started.md
@@ -8,21 +8,20 @@ SageMaker and KubeFlow. This guide shows how to set up the Hopsworks platform in

To follow the instructions on this page you will need the following:

- Kubernetes Version: Hopsworks can be deployed on AKS clusters running Kubernetes >= 1.27.0.
- Kubernetes Version: Hopsworks can be deployed on EKS clusters running Kubernetes >= 1.27.0.
- [aws-cli](https://aws.amazon.com/cli/) to provision the AWS resources
- [eksctl](https://eksctl.io/) to interact with the AWS APIs and provision the EKS cluster
- [helm](https://helm.sh/) to deploy Hopsworks
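
Before starting, it can help to confirm that the listed command-line tools are on the `PATH`. The following is a minimal sketch, not part of the Hopsworks documentation; the helper name `missing_tools` is hypothetical:

```bash
# Hypothetical helper: print the name of every tool that is not on the PATH.
missing_tools() {
  for tool in "$@"; do
    # `command -v` succeeds only if the shell can resolve the tool.
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Tools named in the prerequisites list above.
missing="$(missing_tools aws eksctl helm)"
if [ -n "$missing" ]; then
  echo "Missing tools:"
  echo "$missing"
fi
```

If the script prints nothing, all three tools were found.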

## ECR Registry
### ECR Registry

Hopsworks allows users to customize the images used by Python jobs, Jupyter Notebooks and (Py)Spark applications running in their projects. The images are stored in ECR. Hopsworks needs access to an ECR repository to push the project images.

## Permissions
### Permissions

By default, the deployment requires cluster admin level access to be able to create a set of ClusterRoles, ServiceAccounts and ClusterRoleBindings. If you don’t have cluster admin level access, you can ask your administrator to provision the necessary ClusterRoles, ServiceAccounts and ClusterRoleBindings as described in the section below.

A namespace is required to deploy the Hopsworks stack. If you don’t have permissions to create a namespace you should ask your K8s administrator to provision one for you.
- The deployment requires cluster admin access to create ClusterRoles, ServiceAccounts, and ClusterRoleBindings.

- A namespace is required to deploy the Hopsworks stack. If you don’t have permissions to create a namespace, ask your EKS administrator to provision one.

## EKS Deployment

70 changes: 33 additions & 37 deletions docs/setup_installation/gcp/getting_started.md
@@ -9,15 +9,11 @@ SageMaker and KubeFlow. This guide shows how to set up the Hopsworks platform in

To follow the instructions on this page you will need the following:

- Kubernetes Version: Hopsworks can be deployed on AKS clusters running Kubernetes >= 1.27.0.
- The [gcloud CLI](https://cloud.google.com/sdk/gcloud)
- The [gsutil tool](https://cloud.google.com/storage/docs/gsutil)
- kubectl (to manage the AKS cluster)
- helm (to deploy Hopsworks)
- Kubernetes Version: Hopsworks can be deployed on GKE clusters running Kubernetes >= 1.27.0.
- [gcloud CLI](https://cloud.google.com/sdk/gcloud) to provision the GCP resources
- [gke-gcloud-auth-plugin](https://cloud.google.com/blog/products/containers-kubernetes/kubectl-auth-changes-in-gke) to manage authentication with the GKE cluster
- [helm](https://helm.sh/) to deploy Hopsworks
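
The `>= 1.27.0` requirement can be checked with a small version comparison. This is a minimal sketch, assuming a `sort` that supports `-V` (for example GNU coreutils) and plain `MAJOR.MINOR.PATCH` input. The function name `version_ge` is hypothetical, and provider suffixes such as `-gke.N` would need to be stripped before calling it:

```bash
# Hypothetical helper: succeed if version $1 is greater than or equal to $2.
version_ge() {
  # Version-sort both values; $2 sorts first exactly when $1 >= $2.
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

cluster_version="1.29.3"   # placeholder; substitute your cluster's version
if version_ge "$cluster_version" "1.27.0"; then
  echo "Kubernetes $cluster_version meets the minimum version"
else
  echo "Kubernetes $cluster_version is older than 1.27.0"
fi
```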

## GCR Registry

Hopsworks allows users to customize images for Python jobs, Jupyter Notebooks, and (Py)Spark applications. These images should be stored in Google Container Registry (GCR). The GKE cluster needs access to a GCR repository to push project images.

### Permissions

@@ -38,7 +34,7 @@ gsutil mb -l $region gs://$bucket_name

### Step 1.2: Create Service Account

Create a file named hopsworksai_role.yaml with the following content:
Create a file named `hopsworksai_role.yaml` with the following content:

```yaml
title: Hopsworks AI Instances
@@ -67,56 +63,54 @@ includedPermissions:
Execute the following gcloud command to create a custom role from the file. Replace $PROJECT_ID with your GCP project id:

```bash
gcloud iam roles create hopsworksai_instances --project=$PROJECT_ID --file=hopsworksai_role.yaml
gcloud iam roles create hopsworksai_instances \
--project=$PROJECT_ID \
--file=hopsworksai_role.yaml
```

Create a service account:

Execute the following gcloud command to create a service account for Hopsworks AI instances. Replace $PROJECT_ID with your GCP project id:

```bash
gcloud iam service-accounts create hopsworksai_instances --project=$PROJECT_ID --description="Service account for Hopsworks AI instances" --display-name="Hopsworks AI instances"
gcloud iam service-accounts create hopsworks-ai-instances \
--project=$PROJECT_ID \
--description="Service account for Hopsworks AI instances" \
--display-name="Hopsworks AI instances"
```

Execute the following gcloud command to bind the custom role to the service account. Replace all occurrences of $PROJECT_ID with your GCP project id:

```bash
gcloud projects add-iam-policy-binding $PROJECT_ID --member="serviceAccount:hopsworks-ai-instances@$PROJECT_ID.iam.gserviceaccount.com" --role="projects/$PROJECT_ID/roles/hopsworksai_instances"
gcloud projects add-iam-policy-binding $PROJECT_ID \
--member="serviceAccount:hopsworks-ai-instances@$PROJECT_ID.iam.gserviceaccount.com" \
--role="projects/$PROJECT_ID/roles/hopsworksai_instances"
```
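
The `--member` value above follows the standard service-account email form `ACCOUNT_ID@PROJECT_ID.iam.gserviceaccount.com`. The following is a small sketch of building that email once and reusing it. The helper name `sa_email` and the project id `my-project` are placeholders, not values from the Hopsworks documentation:

```bash
# Hypothetical helper: build a service-account email from an account id and a project id.
sa_email() {
  printf '%s@%s.iam.gserviceaccount.com\n' "$1" "$2"
}

PROJECT_ID="my-project"   # placeholder project id
SA_EMAIL="$(sa_email hopsworks-ai-instances "$PROJECT_ID")"

# The member string passed to `gcloud projects add-iam-policy-binding --member`.
echo "serviceAccount:$SA_EMAIL"
```

Keeping the email in one variable avoids mismatches between the account id used at creation time and the one used in later bindings.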


### Step 1.3: Create a GKE Cluster

```bash
gcloud container clusters create <cluster-name> --zone <zone> --machine-type n2-standard-8 --num-nodes 3 --enable-ip-alias --service-account [email protected]
gcloud container clusters create <cluster-name> \
--zone <zone> \
--machine-type n2-standard-8 \
--num-nodes 1 \
--enable-ip-alias \
--service-account [email protected]
```

### Step 1.4: Create GCR repository

Enable Artifact Registry and create a GCR repository to store images:
Once the creation process is completed, you should be able to access the cluster using the kubectl CLI tool:

```bash
gcloud artifacts repositories create <repo-name> --repository-format=docker --location=<region>
kubectl get nodes
```

### Step 1.5: Link the GCS bucket and the GCR repository

```bash
gsutil iam ch serviceAccount:[email protected]:objectViewer gs://YOUR_BUCKET_NAME
gsutil iam ch serviceAccount:[email protected]:objectAdmin gs://YOUR_BUCKET_NAME

gsutil iam ch serviceAccount:YOUR_EMAIL_ADDRESS:objectViewer gs://YOUR_BUCKET_NAME
gsutil iam ch serviceAccount:YOUR_EMAIL_ADDRESS:objectAdmin gs://YOUR_BUCKET_NAME
### Step 1.4: Create GCR repository

gcloud projects add-iam-policy-binding $PROJECT_ID --member="serviceAccount:SERVICE_ACCOUNT_EMAIL" --role="roles/storage.objectViewer"
```
Hopsworks allows users to customize images for Python jobs, Jupyter Notebooks, and (Py)Spark applications. These images should be stored in Google Container Registry (GCR). The GKE cluster needs access to a GCR repository to push project images.

## Step 2: Configure kubectl
Enable Artifact Registry and create a GCR repository to store images:

```bash
gcloud auth configure-docker

kubectl get pods
gcloud artifacts repositories create <repo-name> \
--repository-format=docker \
--location=<region>
```
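
Images in an Artifact Registry Docker repository are addressed as `LOCATION-docker.pkg.dev/PROJECT_ID/REPOSITORY/IMAGE`. The following is a minimal sketch of assembling that prefix; the helper name `registry_prefix` and all three values are placeholders:

```bash
# Hypothetical helper: build the image prefix for an Artifact Registry Docker repository.
registry_prefix() {
  # $1 = location (region), $2 = project id, $3 = repository name
  printf '%s-docker.pkg.dev/%s/%s\n' "$1" "$2" "$3"
}

REGISTRY="$(registry_prefix europe-west1 my-project hopsworks-images)"
echo "$REGISTRY"

# Docker can be pointed at the registry host with, for example:
#   gcloud auth configure-docker europe-west1-docker.pkg.dev
```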

## Step 3: Setup Hopsworks for Deployment
@@ -177,7 +171,10 @@ global:
Deploy Hopsworks in the created namespace.

```bash
helm install hopsworks hopsworks/hopsworks --namespace hopsworks --values values.gcp.yaml --timeout=600s
helm install hopsworks hopsworks/hopsworks \
--namespace hopsworks \
--values values.gcp.yaml \
--timeout=600s
```

Check that Hopsworks is installing on your provisioned GKE cluster.
@@ -194,7 +191,6 @@ Upon completion (circa 20 minutes), setup a load balancer to access Hopsworks:
kubectl expose deployment hopsworks --type=LoadBalancer --name=hopsworks-service --namespace <namespace>
```


## Step 5: Next steps

Check out our other guides for how to get started with Hopsworks and the Feature Store:
