Updates to the GCP K8s installation documentation #430

Merged 2 commits on Dec 19, 2024
11 changes: 5 additions & 6 deletions docs/setup_installation/aws/getting_started.md
@@ -8,21 +8,20 @@ SageMaker and KubeFlow. This guide shows how to set up the Hopsworks platform in

To follow the instruction on this page you will need the following:

- Kubernetes Version: Hopsworks can be deployed on AKS clusters running Kubernetes >= 1.27.0.
- Kubernetes Version: Hopsworks can be deployed on EKS clusters running Kubernetes >= 1.27.0.
- [aws-cli](https://aws.amazon.com/cli/) to provision the AWS resources
- [eksctl](https://eksctl.io/) to interact with the AWS APIs and provision the EKS cluster
- [helm](https://helm.sh/) to deploy Hopsworks

## ECR Registry
### ECR Registry

Hopsworks allows users to customize the images used by Python jobs, Jupyter Notebooks and (Py)Spark applications running in their projects. The images are stored in ECR. Hopsworks needs access to an ECR repository to push the project images.

## Permissions
### Permissions

By default, the deployment requires cluster admin level access to be able to create a set of ClusterRoles, ServiceAccounts and ClusterRoleBindings. If you don’t have cluster admin level access, you can ask your administrator to provision the necessary ClusterRoles, ServiceAccounts and ClusterRoleBindings as described in the section below.

A namespace is required to deploy the Hopsworks stack. If you don’t have permissions to create a namespace you should ask your K8s administrator to provision one for you.
- The deployment requires cluster admin access to create ClusterRoles, ServiceAccounts, and ClusterRoleBindings.

- A namespace is required to deploy the Hopsworks stack. If you don’t have permissions to create a namespace, ask your EKS administrator to provision one.

## EKS Deployment

70 changes: 33 additions & 37 deletions docs/setup_installation/gcp/getting_started.md
@@ -9,15 +9,11 @@ SageMaker and KubeFlow. This guide shows how to set up the Hopsworks platform in

To follow the instruction on this page you will need the following:

- Kubernetes Version: Hopsworks can be deployed on AKS clusters running Kubernetes >= 1.27.0.
- The [gcloud CLI](https://cloud.google.com/sdk/gcloud)
- The [gsutil tool](https://cloud.google.com/storage/docs/gsutil)
- kubectl (to manage the AKS cluster)
- helm (to deploy Hopsworks)
- Kubernetes Version: Hopsworks can be deployed on GKE clusters running Kubernetes >= 1.27.0.
- [gcloud CLI](https://cloud.google.com/sdk/gcloud) to provision the GCP resources
- [gke-gcloud-auth-plugin](https://cloud.google.com/blog/products/containers-kubernetes/kubectl-auth-changes-in-gke) to manage authentication with the GKE cluster
- [helm](https://helm.sh/) to deploy Hopsworks

## GCR Registry

Hopsworks allows users to customize images for Python jobs, Jupyter Notebooks, and (Py)Spark applications. These images should be stored in Google Container Registry (GCR). The GKE cluster needs access to a GCR repository to push project images.

### Permissions

@@ -38,7 +34,7 @@ gsutil mb -l $region gs://$bucket_name

### Step 1.2: Create Service Account

Create a file named hopsworksai_role.yaml with the following content:
Create a file named `hopsworksai_role.yaml` with the following content:

```bash
title: Hopsworks AI Instances
@@ -67,56 +63,54 @@ includedPermissions:
Execute the following gcloud command to create a custom role from the file. Replace $PROJECT_ID with your GCP project id:

```bash
gcloud iam roles create hopsworksai_instances --project=$PROJECT_ID --file=hopsworksai_role.yaml
gcloud iam roles create hopsworksai_instances \
--project=$PROJECT_ID \
--file=hopsworksai_role.yaml
```

Create a service account:

Execute the following gcloud command to create a service account for Hopsworks AI instances. Replace $PROJECT_ID with your GCP project id:

```bash
gcloud iam service-accounts create hopsworksai_instances --project=$PROJECT_ID --description="Service account for Hopsworks AI instances" --display-name="Hopsworks AI instances"
gcloud iam service-accounts create hopsworksai_instances \
--project=$PROJECT_ID \
--description="Service account for Hopsworks AI instances" \
--display-name="Hopsworks AI instances"
```

Execute the following gcloud command to bind the custom role to the service account. Replace all occurrences $PROJECT_ID with your GCP project id:

```bash
gcloud projects add-iam-policy-binding $PROJECT_ID --member="serviceAccount:hopsworks-ai-instances@$PROJECT_ID.iam.gserviceaccount.com" --role="projects/$PROJECT_ID/roles/hopsworksai_instances"
gcloud projects add-iam-policy-binding $PROJECT_ID \
--member="serviceAccount:hopsworks-ai-instances@$PROJECT_ID.iam.gserviceaccount.com" \
--role="projects/$PROJECT_ID/roles/hopsworksai_instances"
```
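
Review note on the hunk above: the create command names the account `hopsworksai_instances`, while the binding refers to `hopsworks-ai-instances@...`. As far as I know, service account IDs accept only lowercase letters, digits, and hyphens, so the underscore form may be rejected. The following is a minimal sketch, assuming the hyphenated name is the intended one, of deriving the email from a single variable so the two commands cannot drift apart. `sa_email`, `SA_NAME`, `SA_EMAIL`, and the `my-project` placeholder are hypothetical names introduced for illustration, not part of the Hopsworks docs.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not from the Hopsworks docs): build the service
# account email from its ID and the GCP project id.
sa_email() {
  # $1 = service account ID, $2 = GCP project id
  printf '%s@%s.iam.gserviceaccount.com\n' "$1" "$2"
}

SA_NAME="hopsworks-ai-instances"        # assumption: the hyphenated ID is the intended one
PROJECT_ID="${PROJECT_ID:-my-project}"  # placeholder; replace with your GCP project id
SA_EMAIL="$(sa_email "$SA_NAME" "$PROJECT_ID")"

# Print the derived values so they can be checked before running gcloud.
echo "service account: $SA_EMAIL"
echo "custom role:     projects/$PROJECT_ID/roles/hopsworksai_instances"
```

`$SA_NAME` could then be passed to `gcloud iam service-accounts create`, and `$SA_EMAIL` to `--member="serviceAccount:$SA_EMAIL"` and to `--service-account` when creating the cluster.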


### Step 1.3: Create a GKE Cluster

```bash
gcloud container clusters create <cluster-name> --zone <zone> --machine-type n2-standard-8 --num-nodes 3 --enable-ip-alias --service-account [email protected]
gcloud container clusters create <cluster-name> \
--zone <zone> \
--machine-type n2-standard-8 \
--num-nodes 1 \
--enable-ip-alias \
--service-account [email protected]
```

### Step 1.4: Create GCR repository

Enable Artifact Registry and create a GCR repository to store images:
Once the creation process is completed, you should be able to access the cluster using the kubectl CLI tool:

```bash
gcloud artifacts repositories create <repo-name> --repository-format=docker --location=<region>
kubectl get nodes
```
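
Review note: `kubectl get nodes` only works once the local kubeconfig has an entry for the cluster. The sketch below assumes the standard `gcloud container clusters get-credentials` command and the gke-gcloud-auth-plugin listed in the prerequisites. The function name and the sample values are hypothetical. It prints the command instead of running it, so the arguments can be reviewed first.

```bash
#!/usr/bin/env bash
# Hypothetical helper: compose the credentials command for a zonal GKE cluster.
get_credentials_cmd() {
  # $1 = cluster name, $2 = zone, $3 = GCP project id
  printf 'gcloud container clusters get-credentials %s --zone %s --project %s\n' "$1" "$2" "$3"
}

# Sample values are placeholders; substitute the ones used when creating the cluster.
get_credentials_cmd my-cluster europe-west1-b my-project
```

Running the printed command should write the kubeconfig entry that `kubectl` then uses.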

### Step 1.5: Link the GCS bucket and the GCR repository

```bash
gsutil iam ch serviceAccount:[email protected]:objectViewer gs://YOUR_BUCKET_NAME
gsutil iam ch serviceAccount:[email protected]:objectAdmin gs://YOUR_BUCKET_NAME

gsutil iam ch serviceAccount:YOUR_EMAIL_ADDRESS:objectViewer gs://YOUR_BUCKET_NAME
gsutil iam ch serviceAccount:YOUR_EMAIL_ADDRESS:objectAdmin gs://YOUR_BUCKET_NAME
### Step 1.4: Create GCR repository

gcloud projects add-iam-policy-binding $PROJECT_ID --member="serviceAccount:SERVICE_ACCOUNT_EMAIL" --role="roles/storage.objectViewer"
```
Hopsworks allows users to customize images for Python jobs, Jupyter Notebooks, and (Py)Spark applications. These images should be stored in Google Container Registry (GCR). The GKE cluster needs access to a GCR repository to push project images.

## Step 2: Configure kubectl
Enable Artifact Registry and create a GCR repository to store images:

```bash
gcloud auth configure-docker

kubectl get pods
gcloud artifacts repositories create <repo-name> \
--repository-format=docker \
--location=<region>
```

## Step 3: Setup Hopsworks for Deployment
@@ -177,7 +171,10 @@ global:
Deploy Hopsworks in the created namespace.

```bash
helm install hopsworks hopsworks/hopsworks --namespace hopsworks --values values.gcp.yaml --timeout=600s
helm install hopsworks hopsworks/hopsworks \
--namespace hopsworks \
--values values.gcp.yaml \
--timeout=600s
```

Check that Hopsworks is installing on your provisioned AKS cluster.
@@ -194,7 +191,6 @@ Upon completion (circa 20 minutes), setup a load balancer to access Hopsworks:
kubectl expose deployment hopsworks --type=LoadBalancer --name=hopsworks-service --namespace <namespace>
```


## Step 5: Next steps

Check out our other guides for how to get started with Hopsworks and the Feature Store: