Commit

Merge remote-tracking branch 'origin/main' into ray_security_context
bjornsen committed Mar 7, 2024
2 parents 64e0cdf + 775407b commit dc323d2
Showing 16 changed files with 234 additions and 103 deletions.
50 changes: 24 additions & 26 deletions applications/rag/README.md
Original file line number Diff line number Diff line change
@@ -93,8 +93,8 @@ gcloud container clusters get-credentials ${CLUSTER_NAME:?} --location ${CLUSTER
2. Verify the Jupyterhub service is set up:
* Fetch the service IP/Domain:
* IAP disabled: `kubectl get services proxy-public -n $NAMESPACE --output jsonpath='{.status.loadBalancer.ingress[0].ip}'`
* IAP enabled: Read terraform output `jupyter_uri` or use command: `kubectl get managedcertificates jupyter-managed-cert -n $NAMESPACE --output jsonpath='{.status.domainStatus[0].domain}'`
* Remember login [Google Cloud Platform IAP](https://pantheon.corp.google.com/security/iap) to check if user has role `IAP-secured Web App User`
* IAP enabled: Read terraform output: `terraform output jupyter_uri` or use command: `kubectl get managedcertificates jupyter-managed-cert -n $NAMESPACE --output jsonpath='{.status.domainStatus[0].domain}'`
* From [Google Cloud Platform IAP](https://pantheon.corp.google.com/security/iap), check if the allowlisted user has role `IAP-secured Web App User`
* Wait for domain status to be `Active`
* Go to the IP in a browser, which should display the Jupyterlab login UI.
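The IAP-enabled steps above poll the managed certificate until its domain status is `Active`. As a rough illustration (a hypothetical helper, assuming the JSON shape behind the jsonpath queries used above, i.e. the output of `kubectl get managedcertificates ... -o json`):

```python
import json

def domain_status(cert_json: str) -> tuple[str, str]:
    """Return (domain, status) from a ManagedCertificate JSON document.

    Reads the first entry of .status.domainStatus, mirroring the
    jsonpath queries used in the steps above.
    """
    cert = json.loads(cert_json)
    entry = cert["status"]["domainStatus"][0]
    return entry["domain"], entry["status"]

# Sample document standing in for real kubectl output:
example = json.dumps({
    "status": {"domainStatus": [{"domain": "jupyter.example.com", "status": "Active"}]}
})
print(domain_status(example))  # ('jupyter.example.com', 'Active')
```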

@@ -137,37 +137,35 @@ EOF

### Vector Embeddings for Dataset

Choose a password for your CloudSQL user:
```
SQL_PASSWORD=
```

This step generates the vector embeddings for your input dataset. Currently, the default dataset is [Google Maps Restaurant Reviews](https://www.kaggle.com/datasets/denizbilginn/google-maps-restaurant-reviews). We will use a Jupyter notebook to run a Ray job that generates the embeddings and loads them into the `pgvector-instance` Cloud SQL instance created above.

1. Create a CloudSQL user to access the database: `gcloud sql users create rag-user-notebook --password=${SQL_PASSWORD:?} --instance=pgvector-instance --host=%`

2. Go to the Jupyterhub service endpoint in a browser:
1. Fetch the Jupyterhub service endpoint & navigate to it in a browser. This should display the JupyterLab login UI:
* IAP disabled: `kubectl get services proxy-public -n $NAMESPACE --output jsonpath='{.status.loadBalancer.ingress[0].ip}'`
* IAP enabled: Read terraform output `jupyter_uri` or use command: `kubectl get managedcertificates jupyter-managed-cert -n $NAMESPACE --output jsonpath='{.status.domainStatus[0].domain}'`
* Open Google Cloud Console IAM to verify that the user has role `IAP-secured Web App User`
* IAP enabled: Read terraform output: `terraform output jupyter_uri` or use command: `kubectl get managedcertificates jupyter-managed-cert -n $NAMESPACE --output jsonpath='{.status.domainStatus[0].domain}'`
* From [Google Cloud Platform IAP](https://pantheon.corp.google.com/security/iap), check if the allowlisted user has role `IAP-secured Web App User`.
* Wait for the domain status to be `Active`
3. Login with placeholder credentials [TBD: replace with instructions for IAP]:
* username: user
* password: use `terraform output jupyter_password` to fetch the password value

4. Once logged in, choose the `CPU` preset. Go to File -> Open From URL & upload the notebook `rag-kaggle-ray-sql.ipynb` from `https://raw.githubusercontent.com/GoogleCloudPlatform/ai-on-gke/main/applications/rag/example_notebooks/rag-kaggle-ray-sql-latest.ipynb`. This path can also be found by going to the [notebook location](https://github.com/GoogleCloudPlatform/ai-on-gke/blob/main/applications/rag/example_notebooks/rag-kaggle-ray-sql-latest.ipynb) and selecting `Raw`.
2. Log in to Jupyterhub:
* IAP disabled: Use placeholder credentials:
* username: user
* password: use `terraform output jupyter_password` to fetch the password value
* IAP enabled: Log in with your Google credentials.

3. Once logged in, choose the `CPU` preset. Go to File -> Open From URL & upload the notebook `rag-kaggle-ray-sql.ipynb` from `https://raw.githubusercontent.com/GoogleCloudPlatform/ai-on-gke/main/applications/rag/example_notebooks/rag-kaggle-ray-sql-latest.ipynb`. This path can also be found by going to the [notebook location](https://github.com/GoogleCloudPlatform/ai-on-gke/blob/main/applications/rag/example_notebooks/rag-kaggle-ray-sql-latest.ipynb) and selecting `Raw`.

4. Create a Kaggle account, navigate to https://www.kaggle.com/settings/account, and generate an API token (see https://www.kaggle.com/docs/api#authentication for details). This token is used in the notebook to access the [Google Maps Restaurant Reviews dataset](https://www.kaggle.com/datasets/denizbilginn/google-maps-restaurant-reviews).

5. Replace the variables in the 3rd cell with the following to access the database:
* `INSTANCE_CONNECTION_NAME`: `<project_id>:<region>:pgvector-instance`
* `DB_USER`: `rag-user-notebook`
* `DB_PASS`: password from step 1
5. Replace the variables in the 1st cell with your Kaggle credentials (can be found in the `kaggle.json` file created by Step 4):
* `KAGGLE_USERNAME`
* `KAGGLE_KEY`
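The two credentials above live in the `kaggle.json` file produced when the API token is generated. A minimal sketch of loading them from that file, assuming its standard `{"username": ..., "key": ...}` layout (demonstrated against a throwaway file rather than a real `~/.kaggle/kaggle.json`):

```python
import json
import os
import tempfile

def load_kaggle_credentials(path: str) -> tuple[str, str]:
    """Read the username/key pair from a kaggle.json API token file."""
    with open(path) as f:
        creds = json.load(f)
    return creds["username"], creds["key"]

# Demo with a throwaway file standing in for ~/.kaggle/kaggle.json:
with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as tf:
    json.dump({"username": "alice", "key": "abc123"}, tf)
KAGGLE_USERNAME, KAGGLE_KEY = load_kaggle_credentials(tf.name)
os.unlink(tf.name)
print(KAGGLE_USERNAME)  # alice
```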

6. Create a Kaggle account and navigate to https://www.kaggle.com/settings/account and generate an API token. See https://www.kaggle.com/docs/api#authentication how to create one from https://kaggle.com/settings. This token is used in the notebook to access the [Google Maps Restaurant Reviews dataset](https://www.kaggle.com/datasets/denizbilginn/google-maps-restaurant-reviews)
6. Run all the cells in the notebook. This generates vector embeddings for the input dataset (`denizbilginn/google-maps-restaurant-reviews`) and stores them in the `pgvector-instance` via a Ray job.
* When the last cell reports that the job has succeeded (e.g. `Job 'raysubmit_APungAw6TyB55qxk' succeeded`), the vector embeddings have been generated and we can launch the frontend chat interface.
* Ray may take several minutes to create the runtime environment. During this time, the job will appear to be missing (e.g. `Status message: Job has not started yet`).
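The note above about `Job has not started yet` is why a polling loop is the usual pattern around Ray job submission. A minimal, generic sketch (the `fetch_status` callable is a stand-in for something like `JobSubmissionClient.get_job_status`, which is not imported here):

```python
import time

def wait_for_job(fetch_status, timeout_s=600, poll_s=1, _sleep=time.sleep):
    """Poll fetch_status() until it returns a terminal Ray job state."""
    waited = 0
    while waited <= timeout_s:
        status = fetch_status()
        if status in ("SUCCEEDED", "FAILED"):
            return status
        _sleep(poll_s)
        waited += poll_s
    raise TimeoutError("Ray job did not finish within the timeout")

# Simulated run: the job is pending, then running, then succeeds.
statuses = iter(["PENDING", "RUNNING", "SUCCEEDED"])
print(wait_for_job(lambda: next(statuses), _sleep=lambda s: None))  # SUCCEEDED
```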

8. Replace the kaggle username and api token in 2nd cell with your credentials (can be found in the `kaggle.json` file created by Step 6):
* `os.environ['KAGGLE_USERNAME']`
* `os.environ['KAGGLE_KEY']`
### Launch the Frontend Chat Interface

1. Set up port forwarding for the frontend [TBD: Replace with IAP]: `kubectl port-forward service/rag-frontend -n ${NAMESPACE:?} 8080:8080 &`
9. Run all the cells in the notebook. This will generate vector embeddings for the input dataset (`denizbilginn/google-maps-restaurant-reviews`) and store them in the `pgvector-instance` via a Ray job.
* If the Ray job has FAILED, re-run the cell.
* When the Ray job has SUCCEEDED, we are ready to launch the frontend chat interface.
@@ -181,10 +179,10 @@ This step generates the vector embeddings for your input dataset. Currently, the

#### With IAP Enabled
1. Verify that IAP is enabled on Google Cloud Platform (GCP) for your application. If you encounter any errors, try re-enabling IAP.
2. Verify that you have the role `IAP-secured Web App User` assigned to your user account. This role is necessary to access the application through IAP.
2. From [Google Cloud Platform IAP](https://pantheon.corp.google.com/security/iap), check if the allowlisted user has role `IAP-secured Web App User`. This role is necessary to access the application through IAP.
3. Verify the domain is active using the command:
`kubectl get managedcertificates frontend-managed-cert -n rag --output jsonpath='{.status.domainStatus[0].status}'`
3. Read terraform output `frontend_uri` or use the following command to find the domain created by IAP for accessing your service:
3. Read terraform output: `terraform output frontend_uri` or use the following command to find the domain created by IAP for accessing your service:
`kubectl get managedcertificates frontend-managed-cert -n $NAMESPACE --output jsonpath='{.status.domainStatus[0].domain}'`
4. Open your browser and navigate to the domain you retrieved in the previous step to start chatting!

42 changes: 27 additions & 15 deletions applications/rag/example_notebooks/rag-kaggle-ray-sql-latest.ipynb
@@ -2,7 +2,20 @@
"cells": [
{
"cell_type": "code",
"execution_count": 1,
"execution_count": null,
"id": "00b1aff4",
"metadata": {},
"outputs": [],
"source": [
"# Replace these with your settings\n",
"# Navigate to https://www.kaggle.com/settings/account and generate an API token to be used to setup the env variable. See https://www.kaggle.com/docs/api#authentication how to create one.\n",
"KAGGLE_USERNAME = \"<username>\"\n",
"KAGGLE_KEY = \"<token>\"\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a814e91b-3afe-4c28-a3d6-fe087c7af552",
"metadata": {},
"outputs": [],
@@ -13,15 +26,14 @@
},
{
"cell_type": "code",
"execution_count": 2,
"execution_count": null,
"id": "1e26faef-9e2e-4793-b8af-0e18470b482d",
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"# navigate to https://www.kaggle.com/settings/account and generate an API token to be used to setup the env variable. See https://www.kaggle.com/docs/api#authentication how to create one.\n",
"os.environ['KAGGLE_USERNAME'] = \"<username>\"\n",
"os.environ['KAGGLE_KEY'] = \"<token>\"\n",
"os.environ['KAGGLE_USERNAME'] = KAGGLE_USERNAME\n",
"os.environ['KAGGLE_KEY'] = KAGGLE_KEY\n",
"\n",
"# Download the zip file to local storage and then extract the desired contents directly to the GKE GCS CSI mounted bucket. The bucket is mounted at the \"/persist-data\" path in the jupyter pod.\n",
"!kaggle datasets download -d denizbilginn/google-maps-restaurant-reviews -p ~/data --force\n",
@@ -64,12 +76,18 @@
"import sqlalchemy\n",
"\n",
"# initialize parameters\n",
"INSTANCE_CONNECTION_NAME = \"<project-id>:<location>:pgvector-instance\" # Modify the project and region based on your setting\n",
"INSTANCE_CONNECTION_NAME = \"{project}:{region}:pgvector-instance\".format(project=os.environ[\"PROJECT_ID\"], region=os.environ[\"DB_REGION\"])\n",
"print(f\"Your instance connection name is: {INSTANCE_CONNECTION_NAME}\")\n",
"DB_USER = \"rag-user-notebook\" # Modify this based on your setting\n",
"DB_PASS = \"<password>\" # Modify this based on your setting\n",
"DB_NAME = \"pgvector-database\"\n",
"\n",
"db_username_file = open(\"/etc/secret-volume/username\", \"r\")\n",
"DB_USER = db_username_file.read()\n",
"db_username_file.close()\n",
"\n",
"db_password_file = open(\"/etc/secret-volume/password\", \"r\")\n",
"DB_PASS = db_password_file.read()\n",
"db_password_file.close()\n",
"\n",
"# initialize Connector object\n",
"connector = Connector()\n",
"\n",
@@ -189,12 +207,6 @@
" # commit transaction (SQLAlchemy v2.X.X is commit as you go)\n",
" db_conn.commit()\n",
" print(\"Created table=\", TABLE_NAME)\n",
"\n",
" # TODO: Fix workaround access grant for the frontend to access the table.\n",
" grant_access_stmt = \"GRANT SELECT on \" + TABLE_NAME + \" to \\\"rag-user\\\";\"\n",
" db_conn.execute(\n",
" sqlalchemy.text(grant_access_stmt)\n",
" )\n",
" \n",
" query_text = \"INSERT INTO \" + TABLE_NAME + \" (id, text, text_embedding) VALUES (:id, :text, :text_embedding)\"\n",
" insert_stmt = sqlalchemy.text(query_text)\n",
@@ -268,7 +280,7 @@
" \"cloud-sql-python-connector[pg8000]==1.7.0\",\n",
" \"SQLAlchemy==2.0.7\",\n",
" \"huggingface_hub\",\n",
" ]\n",
" ],\n",
" }\n",
")\n",
"\n",
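The notebook cell above now builds the instance connection name from `PROJECT_ID` and `DB_REGION` environment variables instead of a hard-coded placeholder. A standalone sketch of that formatting step (the two environment values below are made up for the demo):

```python
import os

def instance_connection_name(instance: str = "pgvector-instance") -> str:
    """Build the '<project>:<region>:<instance>' string the Cloud SQL
    connector expects, from environment variables as the notebook does."""
    return "{project}:{region}:{instance}".format(
        project=os.environ["PROJECT_ID"],
        region=os.environ["DB_REGION"],
        instance=instance,
    )

# Demo values standing in for the ones set in the Jupyter pod:
os.environ["PROJECT_ID"] = "my-project"
os.environ["DB_REGION"] = "us-central1"
print(instance_connection_name())  # my-project:us-central1:pgvector-instance
```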
10 changes: 8 additions & 2 deletions applications/rag/frontend/container/main.py
@@ -41,8 +41,14 @@
# initialize parameters
INFERENCE_ENDPOINT=os.environ.get('INFERENCE_ENDPOINT', '127.0.0.1:8081')
INSTANCE_CONNECTION_NAME = os.environ.get('INSTANCE_CONNECTION_NAME', '')
DB_USER = os.environ.get('DB_USER', '')
DB_PASS = os.environ.get('DB_PASSWORD', '')

db_username_file = open("/etc/secret-volume/username", "r")
DB_USER = db_username_file.read()
db_username_file.close()

db_password_file = open("/etc/secret-volume/password", "r")
DB_PASS = db_password_file.read()
db_password_file.close()

db = None
filter_names = ['DlpFilter', 'WebRiskFilter']
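Both the notebook and `main.py` now read database credentials from files mounted by the secret volume. As a hedged aside (not part of this commit): mounted secret files frequently end with a newline, so a small helper that strips whitespace can avoid subtle authentication failures:

```python
import tempfile

def read_secret(path: str) -> str:
    """Read a mounted Kubernetes secret file, dropping surrounding whitespace.

    Secret files often end with a newline; stripping avoids sending a
    password with an invisible trailing character.
    """
    with open(path) as f:
        return f.read().strip()

# Demo with a throwaway file standing in for /etc/secret-volume/username:
with tempfile.NamedTemporaryFile("w", delete=False) as tf:
    tf.write("rag-user-notebook\n")
print(read_secret(tf.name))  # rag-user-notebook
```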
2 changes: 1 addition & 1 deletion applications/rag/frontend/container/rai/dlp_filter.py
@@ -14,7 +14,7 @@

import os
import google.cloud.dlp
from .retry import retry
from . import retry

# Convert the project id into a full resource id.
parent = os.environ.get('PROJECT_ID', 'NULL')
2 changes: 1 addition & 1 deletion applications/rag/frontend/container/rai/nlp_filter.py
@@ -14,7 +14,7 @@

import os
import google.cloud.language as language
from .retry import retry
from . import retry

# Convert the project id into a full resource id.
parent = os.environ.get('PROJECT_ID', 'NULL')
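The one-line change in `dlp_filter.py` and `nlp_filter.py` swaps `from .retry import retry` for `from . import retry`: the former binds a name from inside the `retry` module, while the latter binds the submodule itself, so call sites refer to `retry.retry(...)`. A self-contained sketch of the distinction, using a throwaway package built on disk (`rai_demo` is an invented name):

```python
import importlib
import os
import sys
import tempfile

# Build a tiny package mirroring rai/retry.py on the fly (illustration only).
pkg_root = tempfile.mkdtemp()
pkg_dir = os.path.join(pkg_root, "rai_demo")
os.makedirs(pkg_dir)
with open(os.path.join(pkg_dir, "__init__.py"), "w") as f:
    f.write("")
with open(os.path.join(pkg_dir, "retry.py"), "w") as f:
    f.write("def retry(func):\n    return func\n")
with open(os.path.join(pkg_dir, "consumer.py"), "w") as f:
    f.write("from . import retry\n")  # binds the module, not the function

sys.path.insert(0, pkg_root)
consumer = importlib.import_module("rai_demo.consumer")

print(type(consumer.retry).__name__)   # module
print(callable(consumer.retry.retry))  # True
```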
45 changes: 14 additions & 31 deletions applications/rag/frontend/main.tf
@@ -99,13 +99,19 @@ resource "kubernetes_deployment" "rag_frontend_deployment" {
spec {
service_account_name = var.google_service_account
container {
image = "us-central1-docker.pkg.dev/ai-on-gke/rag-on-gke/frontend@sha256:e2dd85e92f42e3684455a316dee5f98f61f1f3fba80b9368bd6f48d5e2e3475e"
image = "us-central1-docker.pkg.dev/ai-on-gke/rag-on-gke/frontend@sha256:3d3b03e4bc6c8fe218105bd69cc6f9cfafb18fc4b1bbb81f5c46f2598b5d5f10"
name = "rag-frontend"

port {
container_port = 8080
}

volume_mount {
name = "secret-volume"
mount_path = "/etc/secret-volume"
read_only = true
}

env {
name = "PROJECT_ID"
value = "projects/${var.project_id}"
@@ -126,36 +132,6 @@ resource "kubernetes_deployment" "rag_frontend_deployment" {
value = var.dataset_embeddings_table_name
}

env {
name = "DB_USER"
value_from {
secret_key_ref {
name = var.db_secret_name
key = "username"
}
}
}

env {
name = "DB_PASSWORD"
value_from {
secret_key_ref {
name = var.db_secret_name
key = "password"
}
}
}

env {
name = "DB_NAME"
value_from {
secret_key_ref {
name = var.db_secret_name
key = "database"
}
}
}

resources {
limits = {
cpu = "3"
@@ -170,6 +146,13 @@
}
}

volume {
secret {
secret_name = var.db_secret_name
}
name = "secret-volume"
}

container {
image = "gcr.io/cloud-sql-connectors/cloud-sql-proxy:2.8.0"
name = "cloud-sql-proxy"
8 changes: 1 addition & 7 deletions applications/rag/frontend/variables.tf
@@ -37,13 +37,7 @@ variable "cloudsql_instance" {

variable "db_secret_name" {
type = string
description = "CloudSQL user"
}

variable "db_secret_namespace" {
type = string
description = "CloudSQL password"
default = "rag"
description = "CloudSQL user credentials"
}

variable "dataset_embeddings_table_name" {
4 changes: 3 additions & 1 deletion applications/rag/main.tf
@@ -120,6 +120,7 @@ module "cloudsql" {
project_id = var.project_id
instance_name = var.cloudsql_instance
namespace = var.kubernetes_namespace
region = var.cloudsql_instance_region
depends_on = [module.namespace]
}

Expand Down Expand Up @@ -179,6 +180,8 @@ module "kuberay-cluster" {
enable_tpu = local.enable_tpu
autopilot_cluster = local.enable_autopilot
google_service_account = var.ray_service_account
db_secret_name = module.cloudsql.db_secret_name
db_region = var.cloudsql_instance_region
grafana_host = module.kuberay-monitoring.grafana_uri
depends_on = [module.kuberay-operator]
}
@@ -212,7 +215,6 @@ module "frontend" {
inference_service_endpoint = module.inference-server.inference_service_endpoint
cloudsql_instance = module.cloudsql.instance
db_secret_name = module.cloudsql.db_secret_name
db_secret_namespace = module.cloudsql.db_secret_namespace
dataset_embeddings_table_name = var.dataset_embeddings_table_name

# IAP Auth parameters
6 changes: 6 additions & 0 deletions applications/rag/variables.tf
@@ -270,6 +270,12 @@ variable "cloudsql_instance" {
default = "pgvector-instance"
}

variable "cloudsql_instance_region" {
type = string
description = "GCP region for CloudSQL instance"
default = "us-central1"
}

variable "cpu_pools" {
type = list(object({
name = string
3 changes: 2 additions & 1 deletion applications/rag/workloads.tfvars
@@ -24,7 +24,8 @@ kubernetes_namespace = "rag"
create_gcs_bucket = true
gcs_bucket = "rag-data-xyzu" # Choose a globally unique bucket name.

cloudsql_instance = "pgvector-instance"
cloudsql_instance = "pgvector-instance"
cloudsql_instance_region = "us-central1"
## Service accounts
# Creates a google service account & k8s service account & configures workload identity with appropriate permissions.
# Set to false & update the variable `ray_service_account` to use an existing IAM service account.