diff --git a/CHANGELOG.md b/CHANGELOG.md index b3bb60c..6296b12 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -5,8 +5,10 @@ NOTES: BREAKING CHANGES: ENHANCEMENTS: +* resource/hopsworksai_cluster: Add support for `gcp_attributes` FEATURES: +* **New Data Source**: `hopsworksai_gcp_service_account_custom_role_permissions` BUG FIXES: diff --git a/docs/data-sources/cluster.md b/docs/data-sources/cluster.md index 51502e2..982a229 100644 --- a/docs/data-sources/cluster.md +++ b/docs/data-sources/cluster.md @@ -38,6 +38,7 @@ data "hopsworksai_clusters" "cluster" { - `creation_date` (String) The creation date of the cluster. The date is represented in RFC3339 format. - `custom_hosted_zone` (String) Override the default cloud.hopsworks.ai Hosted Zone. This option is available only to users with necessary privileges. - `deactivate_hopsworksai_log_collection` (Boolean) Allow Hopsworks.ai to collect services logs to help diagnose issues with the cluster. By deactivating this option, you will not be able to get full support from our teams. +- `gcp_attributes` (List of Object) The configurations required to run the cluster on Google GCP. (see [below for nested schema](#nestedatt--gcp_attributes)) - `head` (List of Object) The configurations of the head node of the cluster. (see [below for nested schema](#nestedatt--head)) - `id` (String) The ID of this resource. 
- `init_script` (String) A bash script that will run on all nodes during their initialization (must start with #!/usr/bin/env bash) @@ -195,6 +196,46 @@ Read-Only: + +### Nested Schema for `gcp_attributes` + +Read-Only: + +- `bucket` (List of Object) (see [below for nested schema](#nestedobjatt--gcp_attributes--bucket)) +- `disk_encryption` (List of Object) (see [below for nested schema](#nestedobjatt--gcp_attributes--disk_encryption)) +- `gke_cluster_name` (String) +- `network` (List of Object) (see [below for nested schema](#nestedobjatt--gcp_attributes--network)) +- `project_id` (String) +- `region` (String) +- `service_account_email` (String) +- `zone` (String) + + +### Nested Schema for `gcp_attributes.bucket` + +Read-Only: + +- `name` (String) + + + +### Nested Schema for `gcp_attributes.disk_encryption` + +Read-Only: + +- `customer_managed_encryption_key` (String) + + + +### Nested Schema for `gcp_attributes.network` + +Read-Only: + +- `network_name` (String) +- `subnetwork_name` (String) + + + ### Nested Schema for `head` diff --git a/docs/data-sources/clusters.md b/docs/data-sources/clusters.md index da014a3..48f333e 100644 --- a/docs/data-sources/clusters.md +++ b/docs/data-sources/clusters.md @@ -63,6 +63,7 @@ Read-Only: - `creation_date` (String) - `custom_hosted_zone` (String) - `deactivate_hopsworksai_log_collection` (Boolean) +- `gcp_attributes` (List of Object) (see [below for nested schema](#nestedobjatt--clusters--gcp_attributes)) - `head` (List of Object) (see [below for nested schema](#nestedobjatt--clusters--head)) - `init_script` (String) - `issue_lets_encrypt_certificate` (Boolean) @@ -219,6 +220,46 @@ Read-Only: + +### Nested Schema for `clusters.gcp_attributes` + +Read-Only: + +- `bucket` (List of Object) (see [below for nested schema](#nestedobjatt--clusters--gcp_attributes--bucket)) +- `disk_encryption` (List of Object) (see [below for nested schema](#nestedobjatt--clusters--gcp_attributes--disk_encryption)) +- `gke_cluster_name` (String) 
+- `network` (List of Object) (see [below for nested schema](#nestedobjatt--clusters--gcp_attributes--network)) +- `project_id` (String) +- `region` (String) +- `service_account_email` (String) +- `zone` (String) + + +### Nested Schema for `clusters.gcp_attributes.bucket` + +Read-Only: + +- `name` (String) + + + +### Nested Schema for `clusters.gcp_attributes.disk_encryption` + +Read-Only: + +- `customer_managed_encryption_key` (String) + + + +### Nested Schema for `clusters.gcp_attributes.network` + +Read-Only: + +- `network_name` (String) +- `subnetwork_name` (String) + + + ### Nested Schema for `clusters.head` diff --git a/docs/data-sources/gcp_service_account_custom_role_permissions.md b/docs/data-sources/gcp_service_account_custom_role_permissions.md new file mode 100644 index 0000000..d028ddd --- /dev/null +++ b/docs/data-sources/gcp_service_account_custom_role_permissions.md @@ -0,0 +1,27 @@ +--- +# generated by https://github.com/hashicorp/terraform-plugin-docs +page_title: "hopsworksai_gcp_service_account_custom_role_permissions Data Source - terraform-provider-hopsworksai" +subcategory: "" +description: |- + Use this data source to get the GCP service account custom role permissions needed by Hopsworks.ai +--- + +# hopsworksai_gcp_service_account_custom_role_permissions (Data Source) + +Use this data source to get the GCP service account custom role permissions needed by Hopsworks.ai + + + + +## Schema + +### Optional + +- `enable_artifact_registry` (Boolean) Add permissions required to enable access to the artifact registry. Defaults to `true`. +- `enable_backup` (Boolean) Add permissions required to allow creating backups of your clusters. Defaults to `true`. +- `enable_storage` (Boolean) Add permissions required to allow Hopsworks clusters to read from and write to your Google storage bucket. Defaults to `true`. + +### Read-Only + +- `id` (String) The ID of this resource. +- `permissions` (List of String) The list of permissions. 
diff --git a/docs/data-sources/instance_type.md b/docs/data-sources/instance_type.md index 311acd1..0a9d180 100644 --- a/docs/data-sources/instance_type.md +++ b/docs/data-sources/instance_type.md @@ -37,7 +37,7 @@ data "hopsworksai_instance_type" "supported_type" { - `cloud_provider` (String) The cloud provider where you plan to create your cluster. - `node_type` (String) The node type that you want to get its smallest instance type. It has to be one of these types (head, worker, rondb_management, rondb_data, rondb_mysql, rondb_api). -- `region` (String) The region/location where you plan to create your cluster. +- `region` (String) The region/location/zone where you plan to create your cluster. In case of GCP you should use the zone name. ### Optional diff --git a/docs/data-sources/instance_types.md b/docs/data-sources/instance_types.md index 24beb61..86613c3 100644 --- a/docs/data-sources/instance_types.md +++ b/docs/data-sources/instance_types.md @@ -28,7 +28,7 @@ data "hopsworksai_instance_types" "supported_worker_types" { - `cloud_provider` (String) The cloud provider where you plan to create your cluster. - `node_type` (String) The node type that you want to get its supported instance types. -- `region` (String) The region/location where you plan to create your cluster. +- `region` (String) The region/location/zone where you plan to create your cluster. In case of GCP you should use the zone name. ### Read-Only diff --git a/docs/index.md b/docs/index.md index 3a9e10e..a207e59 100644 --- a/docs/index.md +++ b/docs/index.md @@ -12,6 +12,7 @@ The Hopsworksai terraform provider is used to interact with [Hopsworks.ai](https If you are new to Hopsworks, then first you need to create an account on [Hopsworks.ai](https://managed.hopsworks.ai), and then you can follow one of the getting started guides to connect either your AWS account or Azure account to create your own Hopsworks clusters. 
* [Getting Started with AWS](https://docs.hopsworks.ai/latest/setup_installation/aws/getting_started/) * [Getting Started with Azure](https://docs.hopsworks.ai/latest/setup_installation/azure/getting_started/) + * [Getting Started with GCP](https://docs.hopsworks.ai/latest/setup_installation/gcp/getting_started/) -> A Hopsworks API Key is required to allow the provider to manage clusters on Hopsworks.ai on your behalf. To create an API Key, follow [this guide](https://docs.hopsworks.ai/latest/setup_installation/common/api_key). @@ -22,7 +23,7 @@ In the following sections, we show two usage examples to create Hopsworks cluste Hopsworks.ai deploys Hopsworks clusters to your AWS account using the permissions provided during [account setup](https://docs.hopsworks.ai/latest/setup_installation/aws/getting_started/#step-1-connecting-your-aws-account). To create a Hopsworks cluster, you will need to create an empty S3 bucket, an ssh key, and an instance profile with the required [Hopsworks permissions](https://docs.hopsworks.ai/latest/setup_installation/aws/getting_started/#step-2-creating-instance-profile). -If you have already created these 3 resources, you can skip the first step in the following terraform example and instead fill the corresponding attributes in Step 2 (*bucket_name*, *ssh_key*, *instance_profile_arn*) with your configuration. +If you have already created these 3 resources, you can skip the first step in the following terraform example and instead fill the corresponding attributes in Step 2 (*bucket/name*, *ssh_key*, *instance_profile_arn*) with your configuration. Otherwise, you need to setup the credentials for your AWS account locally as described [here](https://registry.terraform.io/providers/hashicorp/aws/latest/docs), then you can run the following terraform example which creates the required AWS resources and a Hopsworks cluster. 
```terraform @@ -165,7 +166,7 @@ module "azure" { version = "2.3.0" } -# Step 2: create a cluster with no workers +# Step 2: create a cluster with 1 worker data "hopsworksai_instance_type" "head" { cloud_provider = "AZURE" @@ -237,6 +238,152 @@ output "hopsworks_cluster_url" { } ``` +## GCP Example Usage + +Similar to AWS and Azure, Hopsworks.ai deploys Hopsworks clusters to your GCP project using the permissions provided during [account setup](https://docs.hopsworks.ai/latest/setup_installation/gcp/getting_started/#step-1-connecting-your-gcp-account). +To create a Hopsworks cluster, you will need to create a storage bucket and a service account with the required [Hopsworks permissions](https://docs.hopsworks.ai/latest/setup_installation/gcp/getting_started/#step-3-creating-a-service-account-for-your-cluster-instances). +If you have already created these 2 resources, you can skip the first step in the following terraform example and instead fill the corresponding attributes in Step 2 (*service_account_email*, *bucket/name*) with your configuration. +Otherwise, you need to set up the credentials for your Google account locally as described [here](https://registry.terraform.io/providers/hashicorp/google/latest/docs), then you can run the following terraform example which creates the required Google resources and a Hopsworks cluster. 
+ + +```terraform +terraform { + required_version = ">= 0.14.0" + + required_providers { + google = { + source = "hashicorp/google" + version = "5.13.0" + } + hopsworksai = { + source = "logicalclocks/hopsworksai" + } + } +} + + +variable "region" { + type = string + default = "europe-north1" +} + +variable "project" { + type = string +} + +provider "google" { + region = var.region + project = var.project +} + +provider "hopsworksai" { + # Highly recommended to use the HOPSWORKSAI_API_KEY environment variable instead + api_key = "YOUR HOPSWORKS API KEY" +} + + +# Step 1: Create required google resources, a storage bucket and a service account with the required hopsworks permissions +data "hopsworksai_gcp_service_account_custom_role_permissions" "service_account" { + +} + +resource "google_project_iam_custom_role" "service_account_role" { + role_id = "tf.HopsworksAIInstances" + title = "Hopsworks AI Instances" + description = "Role that allows Hopsworks AI Instances to access resources" + permissions = data.hopsworksai_gcp_service_account_custom_role_permissions.service_account.permissions +} + +resource "google_service_account" "service_account" { + account_id = "tf-hopsworks-ai-instances" + display_name = "Hopsworks AI instances" + description = "Service account for Hopsworks AI instances" +} + +resource "google_project_iam_binding" "service_account_role_binding" { + project = var.project + role = google_project_iam_custom_role.service_account_role.id + + members = [ + google_service_account.service_account.member + ] +} + +resource "google_storage_bucket" "bucket" { + name = "tf-hopsworks-bucket" + location = var.region + force_destroy = true +} + +# Step 2: create a cluster with 1 worker + +data "google_compute_zones" "available" { + region = var.region +} + +locals { + zone = data.google_compute_zones.available.names.0 +} + +data "hopsworksai_instance_type" "head" { + cloud_provider = "GCP" + node_type = "head" + region = local.zone +} + +data 
"hopsworksai_instance_type" "rondb_data" { + cloud_provider = "GCP" + node_type = "rondb_data" + region = local.zone +} + +data "hopsworksai_instance_type" "small_worker" { + cloud_provider = "GCP" + node_type = "worker" + region = local.zone + min_memory_gb = 16 + min_cpus = 4 +} + +resource "hopsworksai_cluster" "cluster" { + name = "tf-cluster" + + head { + instance_type = data.hopsworksai_instance_type.head.id + } + + workers { + instance_type = data.hopsworksai_instance_type.small_worker.id + count = 1 + } + + gcp_attributes { + project_id = var.project + region = var.region + zone = local.zone + service_account_email = google_service_account.service_account.email + bucket { + name = google_storage_bucket.bucket.name + } + } + + rondb { + single_node { + instance_type = data.hopsworksai_instance_type.rondb_data.id + } + } + + open_ports { + ssh = true + } +} + +# Outputs the url of the newly created cluster +output "hopsworks_cluster_url" { + value = hopsworksai_cluster.cluster.url +} +``` + ## Schema diff --git a/docs/resources/cluster.md b/docs/resources/cluster.md index e168c72..50bbb49 100644 --- a/docs/resources/cluster.md +++ b/docs/resources/cluster.md @@ -114,6 +114,7 @@ resource "hopsworksai_cluster" "cluster" { - `collect_logs` (Boolean) Push services' logs to AWS cloud watch. Defaults to `false`. - `custom_hosted_zone` (String) Override the default cloud.hopsworks.ai Hosted Zone. This option is available only to users with necessary privileges. - `deactivate_hopsworksai_log_collection` (Boolean) Allow Hopsworks.ai to collect services logs to help diagnose issues with the cluster. By deactivating this option, you will not be able to get full support from our teams. Defaults to `false`. +- `gcp_attributes` (Block List, Max: 1) The configurations required to run the cluster on Google GCP. 
(see [below for nested schema](#nestedblock--gcp_attributes)) - `init_script` (String) A bash script that will run on all nodes during their initialization (must start with #!/usr/bin/env bash) - `issue_lets_encrypt_certificate` (Boolean) Enable or disable issuing let's encrypt certificates. This can be used to disable issuing certificates if port 80 can not be open. Defaults to `true`. - `managed_users` (Boolean) Enable or disable Hopsworks.ai to manage your users. Defaults to `true`. @@ -445,6 +446,49 @@ Optional: + +### Nested Schema for `gcp_attributes` + +Required: + +- `project_id` (String) The GCP project where the cluster will be created. +- `region` (String) The GCP region where the cluster will be created. +- `service_account_email` (String) The service account email address that the cluster will be started with. +- `zone` (String) The GCP zone where the cluster will be created. + +Optional: + +- `bucket` (Block List, Max: 1) The bucket configurations. (see [below for nested schema](#nestedblock--gcp_attributes--bucket)) +- `disk_encryption` (Block List, Max: 1) The disk encryption configuration. (see [below for nested schema](#nestedblock--gcp_attributes--disk_encryption)) +- `gke_cluster_name` (String) The name of the Google GKE cluster. +- `network` (Block List, Max: 1) The network configurations. (see [below for nested schema](#nestedblock--gcp_attributes--network)) + + +### Nested Schema for `gcp_attributes.bucket` + +Required: + +- `name` (String) The name of the GCP storage bucket that the cluster will use to store data in. + + + +### Nested Schema for `gcp_attributes.disk_encryption` + +Optional: + +- `customer_managed_encryption_key` (String) Specify a customer-managed encryption key to be used for encryption of local storage. The key has to use the format: projects/PROJECT_ID/locations/REGION/keyRings/KEY_RING/cryptoKeys/KEY. + + + +### Nested Schema for `gcp_attributes.network` + +Required: + +- `network_name` (String) The network name. 
+- `subnetwork_name` (String) The subnetwork name. + + + ### Nested Schema for `open_ports` diff --git a/docs/resources/cluster_from_backup.md b/docs/resources/cluster_from_backup.md index 196b419..1660cf0 100644 --- a/docs/resources/cluster_from_backup.md +++ b/docs/resources/cluster_from_backup.md @@ -30,6 +30,7 @@ resource "hopsworksai_cluster_from_backup" "cluster" { - `autoscale` (Block List, Max: 1) Setup auto scaling. (see [below for nested schema](#nestedblock--autoscale)) - `aws_attributes` (Block List, Max: 1) The configurations required to run the cluster on Amazon AWS. (see [below for nested schema](#nestedblock--aws_attributes)) - `azure_attributes` (Block List, Max: 1) The configurations required to run the cluster on Microsoft Azure. (see [below for nested schema](#nestedblock--azure_attributes)) +- `gcp_attributes` (Block List, Max: 1) The configurations required to run the cluster on Google GCP. (see [below for nested schema](#nestedblock--gcp_attributes)) - `name` (String) The name of the cluster, must be unique. - `open_ports` (Block List, Max: 1) Open the required ports to communicate with one of the Hopsworks services. (see [below for nested schema](#nestedblock--open_ports)) - `ssh_key` (String) The ssh key name that will be attached to this cluster. @@ -215,6 +216,49 @@ Read-Only: + +### Nested Schema for `gcp_attributes` + +Optional: + +- `network` (Block List, Max: 1) The network configurations. (see [below for nested schema](#nestedblock--gcp_attributes--network)) +- `service_account_email` (String) The service account email address that the cluster will be started with. + +Read-Only: + +- `bucket` (List of Object) The bucket configurations. (see [below for nested schema](#nestedatt--gcp_attributes--bucket)) +- `disk_encryption` (List of Object) The disk encryption configuration. (see [below for nested schema](#nestedatt--gcp_attributes--disk_encryption)) +- `gke_cluster_name` (String) The name of the Google GKE cluster. 
+- `project_id` (String) The GCP project where the cluster will be created. +- `region` (String) The GCP region where the cluster will be created. +- `zone` (String) The GCP zone where the cluster will be created. + + +### Nested Schema for `gcp_attributes.network` + +Required: + +- `network_name` (String) The network name. +- `subnetwork_name` (String) The subnetwork name. + + + +### Nested Schema for `gcp_attributes.bucket` + +Read-Only: + +- `name` (String) + + + +### Nested Schema for `gcp_attributes.disk_encryption` + +Read-Only: + +- `customer_managed_encryption_key` (String) + + + ### Nested Schema for `open_ports` diff --git a/examples/complete/gcp/README.md b/examples/complete/gcp/README.md new file mode 100644 index 0000000..c8eab95 --- /dev/null +++ b/examples/complete/gcp/README.md @@ -0,0 +1,6 @@ +# Hopsworks.ai GCP Examples + +In this directory, we have the following examples: + +1. [Cluster with basic configuration using 2 workers](./basic) +2. [Cluster with GKE support](./gke) \ No newline at end of file diff --git a/examples/complete/gcp/basic/README.md b/examples/complete/gcp/basic/README.md new file mode 100644 index 0000000..1dfb69a --- /dev/null +++ b/examples/complete/gcp/basic/README.md @@ -0,0 +1,116 @@ +# Hopsworks cluster with 2 workers + +In this example, we create a Hopsworks cluster with 2 workers. We configure one of the workers to use an instance type with at least 4 vCPUs and 16 GB of memory while using at least 32 GB of memory for the other worker. + +## Configure RonDB + +You can configure RonDB nodes instead of relying on the default configurations. For instance, in the following example, we increase the number of data nodes to 4 and use an instance type with at least 8 CPUs and 16 GB of memory. 
+ +```hcl +data "hopsworksai_instance_type" "smallest_rondb_datanode" { + cloud_provider = "GCP" + node_type = "rondb_data" + region = local.zone + min_memory_gb = 16 + min_cpus = 8 +} + +resource "hopsworksai_cluster" "cluster" { + # all the other configurations are omitted for clarity + + rondb { + data_nodes { + instance_type = data.hopsworksai_instance_type.smallest_rondb_datanode.id + disk_size = 512 + count = 4 + } + } +} +``` + +## How to run the example +First, ensure that your GCP account credentials are set up correctly by running the following command: + +```bash +gcloud init +``` + +Then, run the following commands. Replace the placeholder with your Hopsworks API Key. The cluster will be created in the europe-north1 region by default; however, you can choose the region by setting the `region` variable when applying the changes, for example `-var="region=YOUR_REGION" -var="project=YOUR_PROJECT_ID"`. + +```bash +export HOPSWORKSAI_API_KEY= +terraform init +terraform apply +``` + +## Update workers + +You can always update the worker configurations after creation. For example, you can increase the number of small workers from 1 to 2 as follows: + +> **Notice** that you need to run `terraform apply` after updating your configuration for your changes to take effect. 
+ +```hcl +resource "hopsworksai_cluster" "cluster" { + # all the other configurations are omitted for clarity + + workers { + instance_type = data.hopsworksai_instance_type.small_worker.id + disk_size = 256 + count = 2 + } + + workers { + instance_type = data.hopsworksai_instance_type.large_worker.id + disk_size = 512 + count = 1 + } + +} +``` + +You can also remove the large worker by deleting its workers block as follows: + +```hcl +resource "hopsworksai_cluster" "cluster" { + # all the other configurations are omitted for clarity + + workers { + instance_type = data.hopsworksai_instance_type.small_worker.id + disk_size = 256 + count = 2 + } + +} +``` + +You can add a different worker type, for example another worker with at least 16 CPU cores, as follows: + +```hcl +data "hopsworksai_instance_type" "my_worker" { + cloud_provider = "GCP" + node_type = "worker" + region = local.zone + min_cpus = 16 +} + +resource "hopsworksai_cluster" "cluster" { + # all the other configurations are omitted for clarity + + workers { + instance_type = data.hopsworksai_instance_type.small_worker.id + disk_size = 256 + count = 2 + } + + workers { + instance_type = data.hopsworksai_instance_type.my_worker.id + disk_size = 512 + count = 1 + } + +} +``` + +## Terminate the cluster + +You can run `terraform destroy` to delete the cluster and all the other required cloud resources created in this example. 
\ No newline at end of file diff --git a/examples/complete/gcp/basic/main.tf b/examples/complete/gcp/basic/main.tf new file mode 100644 index 0000000..ab8adce --- /dev/null +++ b/examples/complete/gcp/basic/main.tf @@ -0,0 +1,149 @@ +provider "google" { + region = var.region + project = var.project +} + +provider "hopsworksai" { +} + +provider "time" { + +} + +# Create required google resources, a storage bucket and a service account with the required hopsworks permissions +data "hopsworksai_gcp_service_account_custom_role_permissions" "service_account" { + +} + +resource "google_project_iam_custom_role" "service_account_role" { + role_id = "tf.HopsworksAIInstances" + title = "Hopsworks AI Instances" + description = "Role that allows Hopsworks AI Instances to access resources" + permissions = data.hopsworksai_gcp_service_account_custom_role_permissions.service_account.permissions +} + +resource "google_service_account" "service_account" { + account_id = "tf-hopsworks-ai-instances" + display_name = "Hopsworks AI instances" + description = "Service account for Hopsworks AI instances" +} + +resource "google_project_iam_binding" "service_account_role_binding" { + project = var.project + role = google_project_iam_custom_role.service_account_role.id + + members = [ + google_service_account.service_account.member + ] +} + +resource "google_storage_bucket" "bucket" { + name = "tf-hopsworks-bucket" + location = var.region + force_destroy = true +} + +resource "time_sleep" "wait_60_seconds" { + depends_on = [google_project_iam_binding.service_account_role_binding] + + create_duration = "60s" +} + +# Create a simple cluster with two workers with two different configurations +data "google_compute_zones" "available" { + region = var.region +} + +locals { + zone = data.google_compute_zones.available.names.0 +} + +data "hopsworksai_instance_type" "head" { + cloud_provider = "GCP" + node_type = "head" + region = local.zone +} + +data "hopsworksai_instance_type" "rondb_mgm" { + 
cloud_provider = "GCP" + node_type = "rondb_management" + region = local.zone +} + +data "hopsworksai_instance_type" "rondb_data" { + cloud_provider = "GCP" + node_type = "rondb_data" + region = local.zone +} + +data "hopsworksai_instance_type" "rondb_mysql" { + cloud_provider = "GCP" + node_type = "rondb_mysql" + region = local.zone +} + +data "hopsworksai_instance_type" "small_worker" { + cloud_provider = "GCP" + node_type = "worker" + region = local.zone + min_memory_gb = 16 + min_cpus = 4 +} + +data "hopsworksai_instance_type" "large_worker" { + cloud_provider = "GCP" + node_type = "worker" + region = local.zone + min_memory_gb = 32 + min_cpus = 4 +} + +resource "hopsworksai_cluster" "cluster" { + name = "tf-cluster" + + head { + instance_type = data.hopsworksai_instance_type.head.id + } + + workers { + instance_type = data.hopsworksai_instance_type.small_worker.id + disk_size = 256 + count = 1 + } + + workers { + instance_type = data.hopsworksai_instance_type.large_worker.id + disk_size = 512 + count = 1 + } + + gcp_attributes { + project_id = var.project + region = var.region + zone = local.zone + service_account_email = google_service_account.service_account.email + bucket { + name = google_storage_bucket.bucket.name + } + } + + rondb { + management_nodes { + instance_type = data.hopsworksai_instance_type.rondb_mgm.id + } + data_nodes { + instance_type = data.hopsworksai_instance_type.rondb_data.id + } + mysql_nodes { + instance_type = data.hopsworksai_instance_type.rondb_mysql.id + } + } + + open_ports { + ssh = true + } + + # wait for 60 seconds after the service account permissions have been granted + # to avoid a permissions validation failure on hopsworks when creating the cluster + depends_on = [time_sleep.wait_60_seconds] +} diff --git a/examples/complete/gcp/basic/outputs.tf b/examples/complete/gcp/basic/outputs.tf new file mode 100644 index 0000000..445cb97 --- /dev/null +++ b/examples/complete/gcp/basic/outputs.tf @@ -0,0 +1,3 @@ +output 
"hopsworks_cluster_url" { + value = hopsworksai_cluster.cluster.url +} \ No newline at end of file diff --git a/examples/complete/gcp/basic/variables.tf b/examples/complete/gcp/basic/variables.tf new file mode 100644 index 0000000..02255ca --- /dev/null +++ b/examples/complete/gcp/basic/variables.tf @@ -0,0 +1,8 @@ +variable "region" { + type = string + default = "europe-north1" +} + +variable "project" { + type = string +} \ No newline at end of file diff --git a/examples/complete/gcp/basic/versions.tf b/examples/complete/gcp/basic/versions.tf new file mode 100644 index 0000000..d17a3e5 --- /dev/null +++ b/examples/complete/gcp/basic/versions.tf @@ -0,0 +1,17 @@ +terraform { + required_version = ">= 0.14.0" + + required_providers { + google = { + source = "hashicorp/google" + version = "5.13.0" + } + hopsworksai = { + source = "logicalclocks/hopsworksai" + } + time = { + source = "hashicorp/time" + version = "0.10.0" + } + } +} diff --git a/examples/complete/gcp/gke/README.md b/examples/complete/gcp/gke/README.md new file mode 100644 index 0000000..673fa60 --- /dev/null +++ b/examples/complete/gcp/gke/README.md @@ -0,0 +1,6 @@ +# Hopsworks.ai GCP with GKE Examples + +In this directory, we have the following examples: + +1. [Cluster with Standard GKE](./standard) +2. [Cluster with Autopilot GKE](./autopilot) \ No newline at end of file diff --git a/examples/complete/gcp/gke/autopilot/README.md b/examples/complete/gcp/gke/autopilot/README.md new file mode 100644 index 0000000..33bf68d --- /dev/null +++ b/examples/complete/gcp/gke/autopilot/README.md @@ -0,0 +1,47 @@ +# Integrate Hopsworks cluster with Google Autopilot GKE + +In this example, we create an autopilot GKE cluster and a Hopsworks cluster that is integrated with GKE. We also create a VPC network where both GKE and Hopsworks reside, ensuring that they can communicate with each other. 
+ +## Configure RonDB + +You can configure RonDB nodes instead of relying on the default configurations. For instance, in the following example, we increase the number of data nodes to 4 and use an instance type with at least 8 CPUs and 16 GB of memory. + +```hcl +data "hopsworksai_instance_type" "smallest_rondb_datanode" { + cloud_provider = "GCP" + node_type = "rondb_data" + min_memory_gb = 16 + min_cpus = 8 +} + +resource "hopsworksai_cluster" "cluster" { + # all the other configurations are omitted for clarity + + rondb { + data_nodes { + instance_type = data.hopsworksai_instance_type.smallest_rondb_datanode.id + disk_size = 512 + count = 4 + } + } +} +``` + +## How to run the example +First, ensure that your GCP account credentials are set up correctly by running the following command: + +```bash +gcloud init +``` + +Then, run the following commands. Replace the placeholder with your Hopsworks API Key. The GKE and Hopsworks clusters will be created in the europe-north1 region by default; however, you can choose the region by setting the `region` variable when applying the changes, for example `-var="region=YOUR_REGION" -var="project=YOUR_PROJECT_ID"`. + +```bash +export HOPSWORKSAI_API_KEY= +terraform init +terraform apply +``` + +## Terminate the cluster + +You can run `terraform destroy` to delete the cluster and all the other required cloud resources created in this example. 
\ No newline at end of file diff --git a/examples/complete/gcp/gke/autopilot/main.tf b/examples/complete/gcp/gke/autopilot/main.tf new file mode 100644 index 0000000..88df5c2 --- /dev/null +++ b/examples/complete/gcp/gke/autopilot/main.tf @@ -0,0 +1,239 @@ +provider "google" { + region = var.region + project = var.project +} + +provider "hopsworksai" { +} + +# Create required google resources, a storage bucket and a service account with the required hopsworks permissions +data "hopsworksai_gcp_service_account_custom_role_permissions" "service_account" { + +} + +resource "google_project_iam_custom_role" "service_account_role" { + role_id = "tf.HopsworksAIInstances" + title = "Hopsworks AI Instances" + description = "Role that allows Hopsworks AI Instances to access resources" + permissions = data.hopsworksai_gcp_service_account_custom_role_permissions.service_account.permissions +} + +resource "google_service_account" "service_account" { + account_id = "tf-hopsworks-ai-instances" + display_name = "Hopsworks AI instances" + description = "Service account for Hopsworks AI instances" +} + +resource "google_project_iam_binding" "service_account_role_binding" { + project = var.project + role = google_project_iam_custom_role.service_account_role.id + + members = [ + google_service_account.service_account.member + ] +} + +resource "google_storage_bucket" "bucket" { + name = "tf-hopsworks-bucket" + location = var.region + force_destroy = true +} + +# Attach the Kubernetes developer role to the service account used by the cluster instances + +resource "google_project_iam_binding" "service_account_k8s_role_binding" { + project = var.project + role = "roles/container.developer" + + members = [ + google_service_account.service_account.member + ] +} + +# Create a network +data "google_compute_zones" "available" { + region = var.region +} + +locals { + zone = data.google_compute_zones.available.names.0 +} + +resource "google_compute_network" "network" { + name = "tf-hopsworks" + 
auto_create_subnetworks = false + mtu = 1460 +} + +resource "google_compute_subnetwork" "subnetwork" { + name = "tf-hopsworks-subnetwork" + ip_cidr_range = "10.1.0.0/24" + region = var.region + network = google_compute_network.network.id +} + +resource "google_compute_firewall" "nodetonode" { + name = "tf-hopsworks-nodetonode" + network = google_compute_network.network.name + allow { + protocol = "all" + } + direction = "INGRESS" + source_service_accounts = [google_service_account.service_account.email] + target_service_accounts = [google_service_account.service_account.email] +} + +resource "google_compute_firewall" "inbound" { + name = "tf-hopsworks-inbound" + network = google_compute_network.network.name + allow { + protocol = "tcp" + ports = ["80", "443"] + } + + direction = "INGRESS" + target_service_accounts = [google_service_account.service_account.email] + source_ranges = ["0.0.0.0/0"] +} + +# Create an autopilot GKE cluster + +resource "google_container_cluster" "cluster" { + name = "tf-gke-cluster" + enable_autopilot = true + location = var.region + network = google_compute_network.network.name + subnetwork = google_compute_subnetwork.subnetwork.name + + ip_allocation_policy { + cluster_ipv4_cidr_block = "10.124.0.0/14" + } + + deletion_protection = false +} + +resource "google_compute_firewall" "gke_traffic" { + name = "tf-hopsworks-gke-traffic" + network = google_compute_network.network.name + allow { + protocol = "all" + } + + direction = "INGRESS" + target_service_accounts = [google_service_account.service_account.email] + source_ranges = [google_container_cluster.cluster.ip_allocation_policy.0.cluster_ipv4_cidr_block] +} + +resource "google_compute_firewall" "cloud_dns" { + name = "tf-hopsworks-clouddns-traffic" + network = google_compute_network.network.name + allow { + protocol = "udp" + ports = ["53"] + } + + direction = "INGRESS" + target_service_accounts = [google_service_account.service_account.email] + source_ranges = ["35.199.192.0/19"] +} + 
+# Create a simple cluster with autoscale and GKE integration +data "hopsworksai_instance_type" "head" { + cloud_provider = "GCP" + node_type = "head" + region = local.zone +} + +data "hopsworksai_instance_type" "rondb_mgm" { + cloud_provider = "GCP" + node_type = "rondb_management" + region = local.zone +} + +data "hopsworksai_instance_type" "rondb_data" { + cloud_provider = "GCP" + node_type = "rondb_data" + region = local.zone +} + +data "hopsworksai_instance_type" "rondb_mysql" { + cloud_provider = "GCP" + node_type = "rondb_mysql" + region = local.zone +} + +data "hopsworksai_instance_type" "worker" { + cloud_provider = "GCP" + node_type = "worker" + region = local.zone + min_memory_gb = 16 + min_cpus = 8 +} + +resource "hopsworksai_cluster" "cluster" { + name = "tf-cluster" + + head { + instance_type = data.hopsworksai_instance_type.head.id + } + + autoscale { + non_gpu_workers { + instance_type = data.hopsworksai_instance_type.worker.id + disk_size = 256 + min_workers = 1 + max_workers = 5 + standby_workers = 0.5 + downscale_wait_time = 300 + } + } + + gcp_attributes { + project_id = var.project + region = var.region + zone = local.zone + service_account_email = google_service_account.service_account.email + bucket { + name = google_storage_bucket.bucket.name + } + network { + network_name = google_compute_network.network.name + subnetwork_name = google_compute_subnetwork.subnetwork.name + } + gke_cluster_name = google_container_cluster.cluster.name + } + + rondb { + management_nodes { + instance_type = data.hopsworksai_instance_type.rondb_mgm.id + } + data_nodes { + instance_type = data.hopsworksai_instance_type.rondb_data.id + } + mysql_nodes { + instance_type = data.hopsworksai_instance_type.rondb_mysql.id + } + } + +} + +# Configure CloudDNS to forward consul requests to Hopsworks +resource "google_dns_managed_zone" "consul" { + name = "hopsworks-consul" + dns_name = "consul." 
+ description = "Forward .consul DNS requests to Hopsworks" + visibility = "private" + + private_visibility_config { + networks { + network_url = google_compute_network.network.id + } + } + + forwarding_config { + target_name_servers { + ipv4_address = hopsworksai_cluster.cluster.head.0.private_ip + } + } +} + diff --git a/examples/complete/gcp/gke/autopilot/outputs.tf b/examples/complete/gcp/gke/autopilot/outputs.tf new file mode 100644 index 0000000..445cb97 --- /dev/null +++ b/examples/complete/gcp/gke/autopilot/outputs.tf @@ -0,0 +1,3 @@ +output "hopsworks_cluster_url" { + value = hopsworksai_cluster.cluster.url +} \ No newline at end of file diff --git a/examples/complete/gcp/gke/autopilot/variables.tf b/examples/complete/gcp/gke/autopilot/variables.tf new file mode 100644 index 0000000..02255ca --- /dev/null +++ b/examples/complete/gcp/gke/autopilot/variables.tf @@ -0,0 +1,8 @@ +variable "region" { + type = string + default = "europe-north1" +} + +variable "project" { + type = string +} \ No newline at end of file diff --git a/examples/complete/gcp/gke/autopilot/versions.tf b/examples/complete/gcp/gke/autopilot/versions.tf new file mode 100644 index 0000000..74e4162 --- /dev/null +++ b/examples/complete/gcp/gke/autopilot/versions.tf @@ -0,0 +1,13 @@ +terraform { + required_version = ">= 0.14.0" + + required_providers { + google = { + source = "hashicorp/google" + version = "5.13.0" + } + hopsworksai = { + source = "logicalclocks/hopsworksai" + } + } +} diff --git a/examples/complete/gcp/gke/standard/README.md b/examples/complete/gcp/gke/standard/README.md new file mode 100644 index 0000000..0f9c3ec --- /dev/null +++ b/examples/complete/gcp/gke/standard/README.md @@ -0,0 +1,47 @@ +# Integrate Hopsworks cluster with Google Standard GKE + +In this example, we create a standard GKE cluster and a Hopsworks cluster that is integrated with GKE. We also create a VPC network where both GKE and Hopsworks reside ensuring that they can communicate with each other. 
+ +## Configure RonDB + +You can configure RonDB nodes instead of relying on the default configuration. For instance, the following example increases the number of data nodes to 4 and uses an instance type with at least 8 CPUs and 16 GB of memory. + +```hcl +data "hopsworksai_instance_type" "smallest_rondb_datanode" { + cloud_provider = "GCP" + node_type = "rondb_data" + region = local.zone + min_memory_gb = 16 + min_cpus = 8 +} + +resource "hopsworksai_cluster" "cluster" { + # all the other configurations are omitted for clarity + + rondb { + data_nodes { + instance_type = data.hopsworksai_instance_type.smallest_rondb_datanode.id + disk_size = 512 + count = 4 + } + } +} +``` + +## How to run the example +First, ensure that your GCP account credentials are set up correctly by running the following command: + +```bash +gcloud init +``` + +Then, run the following commands. Replace the placeholder with your Hopsworks API Key. The GKE and Hopsworks clusters are created in the europe-north1 region by default; to use a different region, set the `region` variable when applying the changes. The `project` variable has no default, so pass it as well: `-var="region=YOUR_REGION" -var="project=YOUR_PROJECT_ID"` + +```bash +export HOPSWORKSAI_API_KEY= +terraform init +terraform apply +``` + +## Terminate the cluster + +You can run `terraform destroy` to delete the cluster and all the other required cloud resources created in this example. 
\ No newline at end of file diff --git a/examples/complete/gcp/gke/standard/main.tf b/examples/complete/gcp/gke/standard/main.tf new file mode 100644 index 0000000..92c43ae --- /dev/null +++ b/examples/complete/gcp/gke/standard/main.tf @@ -0,0 +1,218 @@ +provider "google" { + region = var.region + project = var.project +} + +provider "hopsworksai" { + +} + +# Create the required Google resources: a storage bucket and a service account with the required Hopsworks permissions +data "hopsworksai_gcp_service_account_custom_role_permissions" "service_account" { + +} + +resource "google_project_iam_custom_role" "service_account_role" { + role_id = "tf.HopsworksAIInstances" + title = "Hopsworks AI Instances" + description = "Role that allows Hopsworks AI Instances to access resources" + permissions = data.hopsworksai_gcp_service_account_custom_role_permissions.service_account.permissions +} + +resource "google_service_account" "service_account" { + account_id = "tf-hopsworks-ai-instances" + display_name = "Hopsworks AI instances" + description = "Service account for Hopsworks AI instances" +} + +resource "google_project_iam_binding" "service_account_role_binding" { + project = var.project + role = google_project_iam_custom_role.service_account_role.id + + members = [ + google_service_account.service_account.member + ] +} + +resource "google_storage_bucket" "bucket" { + name = "tf-hopsworks-bucket" + location = var.region + force_destroy = true +} + +# Attach Kubernetes developer role to the service account for cluster instance + +resource "google_project_iam_binding" "service_account_k8s_role_binding" { + project = var.project + role = "roles/container.developer" + + members = [ + google_service_account.service_account.member + ] +} + +# Create a network +data "google_compute_zones" "available" { + region = var.region +} + +locals { + zone = data.google_compute_zones.available.names.0 +} + +resource "google_compute_network" "network" { + name = "tf-hopsworks" + 
auto_create_subnetworks = false + mtu = 1460 +} + +resource "google_compute_subnetwork" "subnetwork" { + name = "tf-hopsworks-subnetwork" + ip_cidr_range = "10.1.0.0/24" + region = var.region + network = google_compute_network.network.id +} + +resource "google_compute_firewall" "nodetonode" { + name = "tf-hopsworks-nodetonode" + network = google_compute_network.network.name + allow { + protocol = "all" + } + direction = "INGRESS" + source_service_accounts = [google_service_account.service_account.email] + target_service_accounts = [google_service_account.service_account.email] +} + +resource "google_compute_firewall" "inbound" { + name = "tf-hopsworks-inbound" + network = google_compute_network.network.name + allow { + protocol = "tcp" + ports = ["80", "443"] + } + + direction = "INGRESS" + target_service_accounts = [google_service_account.service_account.email] + source_ranges = ["0.0.0.0/0"] +} + +# Create a standard GKE cluster +resource "google_container_cluster" "cluster" { + name = "tf-gke-cluster" + location = local.zone + network = google_compute_network.network.name + subnetwork = google_compute_subnetwork.subnetwork.name + + ip_allocation_policy { + cluster_ipv4_cidr_block = "10.124.0.0/14" + } + + deletion_protection = false + # We can't create a cluster with no node pool defined, but we want to only use + # separately managed node pools. So we create the smallest possible default + # node pool and immediately delete it. 
+ remove_default_node_pool = true + initial_node_count = 1 +} + +resource "google_container_node_pool" "node_pool" { + name = "tf-hopsworks-node-pool" + location = local.zone + cluster = google_container_cluster.cluster.name + node_count = 1 + node_config { + machine_type = "e2-standard-8" + } +} + +resource "google_compute_firewall" "gke_traffic" { + name = "tf-hopsworks-gke-traffic" + network = google_compute_network.network.name + allow { + protocol = "all" + } + + direction = "INGRESS" + target_service_accounts = [google_service_account.service_account.email] + source_ranges = [google_container_cluster.cluster.ip_allocation_policy.0.cluster_ipv4_cidr_block] +} + +# Create a simple cluster with autoscale and GKE integration +data "hopsworksai_instance_type" "head" { + cloud_provider = "GCP" + node_type = "head" + region = local.zone +} + +data "hopsworksai_instance_type" "rondb_mgm" { + cloud_provider = "GCP" + node_type = "rondb_management" + region = local.zone +} + +data "hopsworksai_instance_type" "rondb_data" { + cloud_provider = "GCP" + node_type = "rondb_data" + region = local.zone +} + +data "hopsworksai_instance_type" "rondb_mysql" { + cloud_provider = "GCP" + node_type = "rondb_mysql" + region = local.zone +} + +data "hopsworksai_instance_type" "worker" { + cloud_provider = "GCP" + node_type = "worker" + region = local.zone + min_memory_gb = 16 + min_cpus = 8 +} + +resource "hopsworksai_cluster" "cluster" { + name = "tf-cluster" + + head { + instance_type = data.hopsworksai_instance_type.head.id + } + + autoscale { + non_gpu_workers { + instance_type = data.hopsworksai_instance_type.worker.id + disk_size = 256 + min_workers = 1 + max_workers = 5 + standby_workers = 0.5 + downscale_wait_time = 300 + } + } + + gcp_attributes { + project_id = var.project + region = var.region + zone = local.zone + service_account_email = google_service_account.service_account.email + bucket { + name = google_storage_bucket.bucket.name + } + network { + network_name = 
google_compute_network.network.name + subnetwork_name = google_compute_subnetwork.subnetwork.name + } + gke_cluster_name = google_container_cluster.cluster.name + } + + rondb { + management_nodes { + instance_type = data.hopsworksai_instance_type.rondb_mgm.id + } + data_nodes { + instance_type = data.hopsworksai_instance_type.rondb_data.id + } + mysql_nodes { + instance_type = data.hopsworksai_instance_type.rondb_mysql.id + } + } +} diff --git a/examples/complete/gcp/gke/standard/outputs.tf b/examples/complete/gcp/gke/standard/outputs.tf new file mode 100644 index 0000000..445cb97 --- /dev/null +++ b/examples/complete/gcp/gke/standard/outputs.tf @@ -0,0 +1,3 @@ +output "hopsworks_cluster_url" { + value = hopsworksai_cluster.cluster.url +} \ No newline at end of file diff --git a/examples/complete/gcp/gke/standard/variables.tf b/examples/complete/gcp/gke/standard/variables.tf new file mode 100644 index 0000000..02255ca --- /dev/null +++ b/examples/complete/gcp/gke/standard/variables.tf @@ -0,0 +1,8 @@ +variable "region" { + type = string + default = "europe-north1" +} + +variable "project" { + type = string +} \ No newline at end of file diff --git a/examples/complete/gcp/gke/standard/versions.tf b/examples/complete/gcp/gke/standard/versions.tf new file mode 100644 index 0000000..74e4162 --- /dev/null +++ b/examples/complete/gcp/gke/standard/versions.tf @@ -0,0 +1,13 @@ +terraform { + required_version = ">= 0.14.0" + + required_providers { + google = { + source = "hashicorp/google" + version = "5.13.0" + } + hopsworksai = { + source = "logicalclocks/hopsworksai" + } + } +} diff --git a/examples/provider/provider_azure.tf b/examples/provider/provider_azure.tf index d599802..ff2f086 100644 --- a/examples/provider/provider_azure.tf +++ b/examples/provider/provider_azure.tf @@ -36,7 +36,7 @@ module "azure" { version = "2.3.0" } -# Step 2: create a cluster with no workers +# Step 2: create a cluster with 1 worker data "hopsworksai_instance_type" "head" { cloud_provider = 
"AZURE" diff --git a/examples/provider/provider_gcp.tf b/examples/provider/provider_gcp.tf new file mode 100644 index 0000000..da21058 --- /dev/null +++ b/examples/provider/provider_gcp.tf @@ -0,0 +1,135 @@ +terraform { + required_version = ">= 0.14.0" + + required_providers { + google = { + source = "hashicorp/google" + version = "5.13.0" + } + hopsworksai = { + source = "logicalclocks/hopsworksai" + } + } +} + + +variable "region" { + type = string + default = "europe-north1" +} + +variable "project" { + type = string +} + +provider "google" { + region = var.region + project = var.project +} + +provider "hopsworksai" { + # Highly recommended to use the HOPSWORKSAI_API_KEY environment variable instead + api_key = "YOUR HOPSWORKS API KEY" +} + + +# Step 1: Create required google resources, a storage bucket and an service account with the required hopsworks permissions +data "hopsworksai_gcp_service_account_custom_role_permissions" "service_account" { + +} + +resource "google_project_iam_custom_role" "service_account_role" { + role_id = "tf.HopsworksAIInstances" + title = "Hopsworks AI Instances" + description = "Role that allows Hopsworks AI Instances to access resources" + permissions = data.hopsworksai_gcp_service_account_custom_role_permissions.service_account.permissions +} + +resource "google_service_account" "service_account" { + account_id = "tf-hopsworks-ai-instances" + display_name = "Hopsworks AI instances" + description = "Service account for Hopsworks AI instances" +} + +resource "google_project_iam_binding" "service_account_role_binding" { + project = var.project + role = google_project_iam_custom_role.service_account_role.id + + members = [ + google_service_account.service_account.member + ] +} + +resource "google_storage_bucket" "bucket" { + name = "tf-hopsworks-bucket" + location = var.region + force_destroy = true +} + +# Step 2: create a cluster with 1 worker + +data "google_compute_zones" "available" { + region = var.region +} + +locals { + zone 
= data.google_compute_zones.available.names.0 +} + +data "hopsworksai_instance_type" "head" { + cloud_provider = "GCP" + node_type = "head" + region = local.zone +} + +data "hopsworksai_instance_type" "rondb_data" { + cloud_provider = "GCP" + node_type = "rondb_data" + region = local.zone +} + +data "hopsworksai_instance_type" "small_worker" { + cloud_provider = "GCP" + node_type = "worker" + region = local.zone + min_memory_gb = 16 + min_cpus = 4 +} + +resource "hopsworksai_cluster" "cluster" { + name = "tf-cluster" + + head { + instance_type = data.hopsworksai_instance_type.head.id + } + + workers { + instance_type = data.hopsworksai_instance_type.small_worker.id + count = 1 + } + + gcp_attributes { + project_id = var.project + region = var.region + zone = local.zone + service_account_email = google_service_account.service_account.email + bucket { + name = google_storage_bucket.bucket.name + } + } + + rondb { + single_node { + instance_type = data.hopsworksai_instance_type.rondb_data.id + } + } + + open_ports { + ssh = true + } +} + +# Outputs the URL of the newly created cluster +output "hopsworks_cluster_url" { + value = hopsworksai_cluster.cluster.url +} \ No newline at end of file diff --git a/examples/resources/hopsworksai_cluster/resource_gcp.tf b/examples/resources/hopsworksai_cluster/resource_gcp.tf new file mode 100644 index 0000000..1f07467 --- /dev/null +++ b/examples/resources/hopsworksai_cluster/resource_gcp.tf @@ -0,0 +1,38 @@ +resource "hopsworksai_cluster" "cluster" { + name = "my-cluster-name" + + head { + instance_type = "" + } + + + gcp_attributes { + project_id = "my-project" + region = "us-east1" + zone = "us-east1-b" + service_account_email = "hopsworks-ai-instances@my-project.iam.gserviceaccount.com" + bucket { + name = "my-bucket" + } + } + + rondb { + management_nodes { + instance_type = "" + } + data_nodes { + instance_type = "" + } + mysql_nodes { + instance_type = "" + } + } + + open_ports { + ssh = true + } + + tags = { + "Purpose" 
= "testing" + } +} \ No newline at end of file diff --git a/hopsworksai/data_source_clusters.go b/hopsworksai/data_source_clusters.go index 5e6f8ce..580cfcb 100644 --- a/hopsworksai/data_source_clusters.go +++ b/hopsworksai/data_source_clusters.go @@ -36,7 +36,7 @@ func dataSourceClusters() *schema.Resource { Description: "Filter based on cloud provider.", Type: schema.TypeString, Optional: true, - ValidateFunc: validation.StringInSlice([]string{api.AWS.String(), api.AZURE.String()}, false), + ValidateFunc: validation.StringInSlice([]string{api.AWS.String(), api.AZURE.String(), api.GCP.String()}, false), }, }, }, diff --git a/hopsworksai/data_source_clusters_test.go b/hopsworksai/data_source_clusters_test.go index 5c259bb..8a39a09 100644 --- a/hopsworksai/data_source_clusters_test.go +++ b/hopsworksai/data_source_clusters_test.go @@ -155,6 +155,12 @@ func TestClustersDataSourceRead(t *testing.T) { "name": "cluster-name-3", "createdOn": 3, "provider": "AZURE" + }, + { + "id": "cluster-4", + "name": "cluster-name-4", + "createdOn": 3, + "provider": "GCP" } ] } @@ -347,6 +353,64 @@ func TestClustersDataSourceRead(t *testing.T) { }, "update_state": "none", }, + map[string]interface{}{ + "cluster_id": "cluster-4", + "name": "cluster-name-4", + "state": "", + "activation_state": "", + "creation_date": time.Unix(3, 0).Format(time.RFC3339), + "start_date": time.Unix(0, 0).Format(time.RFC3339), + "version": "", + "url": "", + "tags": map[string]interface{}{}, + "ssh_key": "", + "head": []interface{}{ + map[string]interface{}{ + "instance_type": "", + "disk_size": 0, + "node_id": "", + "ha_enabled": false, + "private_ip": "", + }, + }, + "workers": schema.NewSet(helpers.WorkerSetHash, []interface{}{}), + "attach_public_ip": false, + "issue_lets_encrypt_certificate": false, + "managed_users": false, + "backup_retention_period": 0, + "gcp_attributes": []interface{}{ + map[string]interface{}{ + "project_id": "", + "region": "", + "zone": "", + "service_account_email": "", + 
"network": []interface{}{ + map[string]interface{}{ + "network_name": "", + "subnetwork_name": "", + }, + }, + "gke_cluster_name": "", + "bucket": []interface{}{ + map[string]interface{}{ + "name": "", + }, + }, + "disk_encryption": []interface{}{}, + }, + }, + "aws_attributes": []interface{}{}, + "azure_attributes": []interface{}{}, + "open_ports": []interface{}{ + map[string]interface{}{ + "ssh": false, + "kafka": false, + "feature_store": false, + "online_feature_store": false, + }, + }, + "update_state": "none", + }, }, }, } diff --git a/hopsworksai/data_source_gcp_service_account_custom_role_permissions.go b/hopsworksai/data_source_gcp_service_account_custom_role_permissions.go new file mode 100644 index 0000000..c97002a --- /dev/null +++ b/hopsworksai/data_source_gcp_service_account_custom_role_permissions.go @@ -0,0 +1,81 @@ +package hopsworksai + +import ( + "context" + "strconv" + "strings" + + "github.com/hashicorp/terraform-plugin-sdk/v2/diag" + "github.com/hashicorp/terraform-plugin-sdk/v2/helper/schema" +) + +func dataSourceGCPServiceAccountCustomRolePermissions() *schema.Resource { + return &schema.Resource{ + Description: "Use this data source to get the GCP service account custom role permissions needed by Hopsworks.ai", + Schema: map[string]*schema.Schema{ + "enable_storage": { + Description: "Add permissions required to allow Hopsworks clusters to read and write from and to your google storage bucket.", + Type: schema.TypeBool, + Optional: true, + Default: true, + }, + "enable_backup": { + Description: "Add permissions required to allow creating backups of your clusters.", + Type: schema.TypeBool, + Optional: true, + Default: true, + }, + "enable_artifact_registry": { + Description: "Add permissions required to enable access to the artifact registry", + Type: schema.TypeBool, + Optional: true, + Default: true, + }, + "permissions": { + Description: "The list of permissions.", + Type: schema.TypeList, + Computed: true, + Elem: &schema.Schema{ + 
Type: schema.TypeString, + }, + }, + }, + ReadContext: dataSourceGCPServiceAccountCustomRolePermissionsRead, + } +} + +func dataSourceGCPServiceAccountCustomRolePermissionsRead(ctx context.Context, d *schema.ResourceData, meta interface{}) diag.Diagnostics { + permissions := []string{} + + if d.Get("enable_storage").(bool) { + permissions = append(permissions, "storage.buckets.get", + "storage.multipartUploads.abort", + "storage.multipartUploads.create", + "storage.multipartUploads.list", + "storage.multipartUploads.listParts", + "storage.objects.create", + "storage.objects.delete", + "storage.objects.get", + "storage.objects.list", + "storage.objects.update") + } + + if d.Get("enable_backup").(bool) { + permissions = append(permissions, "storage.buckets.update") + } + + if d.Get("enable_artifact_registry").(bool) { + permissions = append(permissions, "artifactregistry.repositories.create", + "artifactregistry.repositories.get", + "artifactregistry.repositories.uploadArtifacts", + "artifactregistry.repositories.downloadArtifacts", + "artifactregistry.tags.list", + "artifactregistry.tags.delete") + } + + d.SetId(strconv.Itoa(schema.HashString(strings.Join(permissions, ",")))) + if err := d.Set("permissions", permissions); err != nil { + return diag.FromErr(err) + } + return nil +} diff --git a/hopsworksai/data_source_gcp_service_account_custom_role_permissions_test.go b/hopsworksai/data_source_gcp_service_account_custom_role_permissions_test.go new file mode 100644 index 0000000..b77c1cc --- /dev/null +++ b/hopsworksai/data_source_gcp_service_account_custom_role_permissions_test.go @@ -0,0 +1,139 @@ +package hopsworksai + +import ( + "testing" + + "github.com/hashicorp/terraform-plugin-testing/helper/resource" +) + +func TestAccGCPServiceAccountCustomRole_basic(t *testing.T) { + dataSourceName := "data.hopsworksai_gcp_service_account_custom_role_permissions.test" + resource.UnitTest(t, resource.TestCase{ + ProviderFactories: testAccProviderFactories, + Steps: 
[]resource.TestStep{ + { + Config: testAccGCPServiceAccountCustomRole_basic(), + Check: resource.ComposeTestCheckFunc( + resource.TestCheckResourceAttr(dataSourceName, "permissions.#", "17"), + resource.TestCheckResourceAttr(dataSourceName, "permissions.0", "storage.buckets.get"), + resource.TestCheckResourceAttr(dataSourceName, "permissions.1", "storage.multipartUploads.abort"), + resource.TestCheckResourceAttr(dataSourceName, "permissions.2", "storage.multipartUploads.create"), + resource.TestCheckResourceAttr(dataSourceName, "permissions.3", "storage.multipartUploads.list"), + resource.TestCheckResourceAttr(dataSourceName, "permissions.4", "storage.multipartUploads.listParts"), + resource.TestCheckResourceAttr(dataSourceName, "permissions.5", "storage.objects.create"), + resource.TestCheckResourceAttr(dataSourceName, "permissions.6", "storage.objects.delete"), + resource.TestCheckResourceAttr(dataSourceName, "permissions.7", "storage.objects.get"), + resource.TestCheckResourceAttr(dataSourceName, "permissions.8", "storage.objects.list"), + resource.TestCheckResourceAttr(dataSourceName, "permissions.9", "storage.objects.update"), + resource.TestCheckResourceAttr(dataSourceName, "permissions.10", "storage.buckets.update"), + resource.TestCheckResourceAttr(dataSourceName, "permissions.11", "artifactregistry.repositories.create"), + resource.TestCheckResourceAttr(dataSourceName, "permissions.12", "artifactregistry.repositories.get"), + resource.TestCheckResourceAttr(dataSourceName, "permissions.13", "artifactregistry.repositories.uploadArtifacts"), + resource.TestCheckResourceAttr(dataSourceName, "permissions.14", "artifactregistry.repositories.downloadArtifacts"), + resource.TestCheckResourceAttr(dataSourceName, "permissions.15", "artifactregistry.tags.list"), + resource.TestCheckResourceAttr(dataSourceName, "permissions.16", "artifactregistry.tags.delete"), + ), + }, + }, + }) +} + +func TestAccGCPServiceAccountCustomRole_noBackup(t *testing.T) { + dataSourceName 
:= "data.hopsworksai_gcp_service_account_custom_role_permissions.test" + resource.UnitTest(t, resource.TestCase{ + ProviderFactories: testAccProviderFactories, + Steps: []resource.TestStep{ + { + Config: testAccGCPServiceAccountCustomRole_noBackup(), + Check: resource.ComposeTestCheckFunc( + resource.TestCheckResourceAttr(dataSourceName, "permissions.#", "16"), + resource.TestCheckResourceAttr(dataSourceName, "permissions.0", "storage.buckets.get"), + resource.TestCheckResourceAttr(dataSourceName, "permissions.1", "storage.multipartUploads.abort"), + resource.TestCheckResourceAttr(dataSourceName, "permissions.2", "storage.multipartUploads.create"), + resource.TestCheckResourceAttr(dataSourceName, "permissions.3", "storage.multipartUploads.list"), + resource.TestCheckResourceAttr(dataSourceName, "permissions.4", "storage.multipartUploads.listParts"), + resource.TestCheckResourceAttr(dataSourceName, "permissions.5", "storage.objects.create"), + resource.TestCheckResourceAttr(dataSourceName, "permissions.6", "storage.objects.delete"), + resource.TestCheckResourceAttr(dataSourceName, "permissions.7", "storage.objects.get"), + resource.TestCheckResourceAttr(dataSourceName, "permissions.8", "storage.objects.list"), + resource.TestCheckResourceAttr(dataSourceName, "permissions.9", "storage.objects.update"), + resource.TestCheckResourceAttr(dataSourceName, "permissions.10", "artifactregistry.repositories.create"), + resource.TestCheckResourceAttr(dataSourceName, "permissions.11", "artifactregistry.repositories.get"), + resource.TestCheckResourceAttr(dataSourceName, "permissions.12", "artifactregistry.repositories.uploadArtifacts"), + resource.TestCheckResourceAttr(dataSourceName, "permissions.13", "artifactregistry.repositories.downloadArtifacts"), + resource.TestCheckResourceAttr(dataSourceName, "permissions.14", "artifactregistry.tags.list"), + resource.TestCheckResourceAttr(dataSourceName, "permissions.15", "artifactregistry.tags.delete"), + ), + }, + }, + }) +} + +func 
TestAccGCPServiceAccountCustomRole_noStorage(t *testing.T) { + dataSourceName := "data.hopsworksai_gcp_service_account_custom_role_permissions.test" + resource.UnitTest(t, resource.TestCase{ + ProviderFactories: testAccProviderFactories, + Steps: []resource.TestStep{ + { + Config: testAccGCPServiceAccountCustomRole_noStorage(), + Check: resource.ComposeTestCheckFunc( + resource.TestCheckResourceAttr(dataSourceName, "permissions.#", "6"), + resource.TestCheckResourceAttr(dataSourceName, "permissions.0", "artifactregistry.repositories.create"), + resource.TestCheckResourceAttr(dataSourceName, "permissions.1", "artifactregistry.repositories.get"), + resource.TestCheckResourceAttr(dataSourceName, "permissions.2", "artifactregistry.repositories.uploadArtifacts"), + resource.TestCheckResourceAttr(dataSourceName, "permissions.3", "artifactregistry.repositories.downloadArtifacts"), + resource.TestCheckResourceAttr(dataSourceName, "permissions.4", "artifactregistry.tags.list"), + resource.TestCheckResourceAttr(dataSourceName, "permissions.5", "artifactregistry.tags.delete"), + ), + }, + }, + }) +} +func TestAccGCPServiceAccountCustomRole_noPerm(t *testing.T) { + dataSourceName := "data.hopsworksai_gcp_service_account_custom_role_permissions.test" + resource.UnitTest(t, resource.TestCase{ + ProviderFactories: testAccProviderFactories, + Steps: []resource.TestStep{ + { + Config: testAccGCPServiceAccountCustomRole_noPerm(), + Check: resource.ComposeTestCheckFunc( + resource.TestCheckResourceAttr(dataSourceName, "permissions.#", "0"), + ), + }, + }, + }) +} + +func testAccGCPServiceAccountCustomRole_basic() string { + return ` + data "hopsworksai_gcp_service_account_custom_role_permissions" "test" { + } + ` +} + +func testAccGCPServiceAccountCustomRole_noBackup() string { + return ` + data "hopsworksai_gcp_service_account_custom_role_permissions" "test" { + enable_backup = false + } + ` +} + +func testAccGCPServiceAccountCustomRole_noStorage() string { + return ` + data 
"hopsworksai_gcp_service_account_custom_role_permissions" "test" { + enable_backup = false + enable_storage = false + } + ` +} + +func testAccGCPServiceAccountCustomRole_noPerm() string { + return ` + data "hopsworksai_gcp_service_account_custom_role_permissions" "test" { + enable_backup = false + enable_storage = false + enable_artifact_registry = false + } + ` +} diff --git a/hopsworksai/data_source_instance_type.go b/hopsworksai/data_source_instance_type.go index db108b0..edbba6e 100644 --- a/hopsworksai/data_source_instance_type.go +++ b/hopsworksai/data_source_instance_type.go @@ -25,10 +25,10 @@ func dataSourceInstanceType() *schema.Resource { Description: "The cloud provider where you plan to create your cluster.", Type: schema.TypeString, Required: true, - ValidateFunc: validation.StringInSlice([]string{api.AWS.String(), api.AZURE.String()}, false), + ValidateFunc: validation.StringInSlice([]string{api.AWS.String(), api.AZURE.String(), api.GCP.String()}, false), }, "region": { - Description: "The region/location where you plan to create your cluster.", + Description: "The region/location/zone where you plan to create your cluster. 
In case of GCP you should use the zone name.", Type: schema.TypeString, Required: true, }, diff --git a/hopsworksai/data_source_instance_types.go b/hopsworksai/data_source_instance_types.go index 5e3e236..3298e16 100644 --- a/hopsworksai/data_source_instance_types.go +++ b/hopsworksai/data_source_instance_types.go @@ -25,10 +25,10 @@ func dataSourceInstanceTypes() *schema.Resource { Description: "The cloud provider where you plan to create your cluster.", Type: schema.TypeString, Required: true, - ValidateFunc: validation.StringInSlice([]string{api.AWS.String(), api.AZURE.String()}, false), + ValidateFunc: validation.StringInSlice([]string{api.AWS.String(), api.AZURE.String(), api.GCP.String()}, false), }, "region": { - Description: "The region/location where you plan to create your cluster.", + Description: "The region/location/zone where you plan to create your cluster. In case of GCP you should use the zone name.", Type: schema.TypeString, Required: true, }, diff --git a/hopsworksai/internal/api/apis.go b/hopsworksai/internal/api/apis.go index 2a7277d..1cc6867 100644 --- a/hopsworksai/internal/api/apis.go +++ b/hopsworksai/internal/api/apis.go @@ -31,6 +31,9 @@ func NewCluster(ctx context.Context, apiClient APIHandler, createRequest interfa case CreateAWSCluster, *CreateAWSCluster: tflog.Debug(ctx, fmt.Sprintf("new aws cluster: %#v", createRequest)) cloudProvider = AWS + case CreateGCPCluster, *CreateGCPCluster: + tflog.Debug(ctx, fmt.Sprintf("new gcp cluster: %#v", createRequest)) + cloudProvider = GCP default: return "", fmt.Errorf("unknown cloud provider %#v", createRequest) } @@ -156,10 +159,13 @@ func GetSupportedInstanceTypes(ctx context.Context, apiClient APIHandler, cloud if err := apiClient.doRequest(ctx, http.MethodGet, url, nil, &response); err != nil { return nil, err } - if cloud == AWS { + switch cloud { + case AWS: return &response.Payload.AWS, nil - } else if cloud == AZURE { + case AZURE: return &response.Payload.AZURE, nil + case GCP: + return 
&response.Payload.GCP, nil } return nil, fmt.Errorf("unknown cloud provider %s", cloud.String()) } @@ -243,6 +249,8 @@ func NewClusterFromBackup(ctx context.Context, apiClient APIHandler, backupId st tflog.Debug(ctx, fmt.Sprintf("restore aws cluster: #%v", createRequest)) case CreateAzureClusterFromBackup, *CreateAzureClusterFromBackup: tflog.Debug(ctx, fmt.Sprintf("restore azure cluster: #%v", createRequest)) + case CreateGCPClusterFromBackup, *CreateGCPClusterFromBackup: + tflog.Debug(ctx, fmt.Sprintf("restore gcp cluster: #%v", createRequest)) default: return "", fmt.Errorf("unknown create request #%v", createRequest) } diff --git a/hopsworksai/internal/api/apis_test.go b/hopsworksai/internal/api/apis_test.go index 8d71eca..87aeee8 100644 --- a/hopsworksai/internal/api/apis_test.go +++ b/hopsworksai/internal/api/apis_test.go @@ -897,6 +897,211 @@ func TestNewClusterAZURE(t *testing.T) { } } +func TestNewClusterGCP(t *testing.T) { + apiClient := &HopsworksAIClient{ + Client: &test.HttpClientFixture{ + ExpectMethod: http.MethodPost, + ExpectPath: "/api/clusters", + ExpectRequestBody: `{ + "cloudProvider": "GCP", + "cluster": { + "name": "cluster-1", + "version": "2.0", + "sshKeyName": "ssh-key-1", + "clusterConfiguration": { + "head": { + "instanceType": "node-type-1", + "diskSize": 512, + "haEnabled": false + }, + "workers": [ + { + "instanceType": "node-type-2", + "diskSize": 256, + "count": 2 + } + ] + }, + "issueLetsEncrypt": true, + "attachPublicIP": true, + "backupRetentionPeriod": 10, + "managedUsers": true, + "tags": [ + { + "name": "tag1", + "value": "tag1-value1" + } + ], + "ronDB": { + "allInOne": false, + "configuration": { + "ndbdDefault": { + "replicationFactor": 2 + }, + "general": { + "benchmark": { + "grantUserPrivileges": false + } + } + }, + "mgmd": { + "instanceType": "mgm-node-1", + "diskSize": 30, + "count": 1 + }, + "ndbd": { + "instanceType": "data-node-1", + "diskSize": 512, + "count": 2 + }, + "mysqld": { + "instanceType": 
"mysqld-node-1", + "diskSize": 100, + "count": 1, + "arrowFlight": false + }, + "api": { + "instanceType": "api-node-1", + "diskSize": 50, + "count": 1 + } + }, + "initScript": "", + "runInitScriptFirst": false, + "deactivateLogReport": false, + "collectLogs": false, + "clusterDomainPrefix": "my-prefix", + "customHostedZone": "custom.zone.ai", + "project": "project-1", + "region": "region-1", + "zone": "zone-1", + "serviceAccountEmail": "service-account-1", + "bucketName": "bucket-1", + "networkName": "network-1", + "subNetworkName": "sub-1", + "gkeClusterName": "gke-cluster-1", + "diskEncryption": { + "customerManagedKey": "key-1" + } + + } + }`, + ResponseCode: http.StatusOK, + ResponseBody: `{ + "apiVersion": "v1", + "status": "ok", + "code": 200, + "payload":{ + "id" : "new-cluster-id-1" + } + }`, + T: t, + }, + } + + input := CreateGCPCluster{ + CreateCluster: CreateCluster{ + Name: "cluster-1", + Version: "2.0", + SshKeyName: "ssh-key-1", + ClusterConfiguration: ClusterConfiguration{ + Head: HeadConfiguration{ + NodeConfiguration: NodeConfiguration{ + InstanceType: "node-type-1", + DiskSize: 512, + }, + }, + Workers: []WorkerConfiguration{ + { + NodeConfiguration: NodeConfiguration{ + InstanceType: "node-type-2", + DiskSize: 256, + }, + Count: 2, + }, + }, + }, + IssueLetsEncrypt: true, + AttachPublicIP: true, + BackupRetentionPeriod: 10, + ManagedUsers: true, + Tags: []ClusterTag{ + { + Name: "tag1", + Value: "tag1-value1", + }, + }, + ClusterDomainPrefix: "my-prefix", + CustomHostedZone: "custom.zone.ai", + RonDB: &RonDBConfiguration{ + Configuration: RonDBBaseConfiguration{ + NdbdDefault: RonDBNdbdDefaultConfiguration{ + ReplicationFactor: 2, + }, + General: RonDBGeneralConfiguration{ + Benchmark: RonDBBenchmarkConfiguration{ + GrantUserPrivileges: false, + }, + }, + }, + ManagementNodes: RonDBNodeConfiguration{ + NodeConfiguration: NodeConfiguration{ + InstanceType: "mgm-node-1", + DiskSize: 30, + }, + Count: 1, + }, + DataNodes: RonDBNodeConfiguration{ + 
NodeConfiguration: NodeConfiguration{ + InstanceType: "data-node-1", + DiskSize: 512, + }, + Count: 2, + }, + MYSQLNodes: MYSQLNodeConfiguration{ + RonDBNodeConfiguration: RonDBNodeConfiguration{ + NodeConfiguration: NodeConfiguration{ + InstanceType: "mysqld-node-1", + DiskSize: 100, + }, + Count: 1, + }, + ArrowFlightServer: false, + }, + APINodes: RonDBNodeConfiguration{ + NodeConfiguration: NodeConfiguration{ + InstanceType: "api-node-1", + DiskSize: 50, + }, + Count: 1, + }, + }, + }, + GCPCluster: GCPCluster{ + Project: "project-1", + Region: "region-1", + Zone: "zone-1", + ServiceAccountEmail: "service-account-1", + BucketName: "bucket-1", + NetworkName: "network-1", + SubNetworkName: "sub-1", + GkeClusterName: "gke-cluster-1", + DiskEncryption: &GCPDiskEncryption{ + CustomerManagedKey: "key-1", + }, + }, + } + + clusterId, err := NewCluster(context.TODO(), apiClient, input) + if err != nil { + t.Fatalf("new cluster should not throw error, but got %s", err) + } + + if clusterId != "new-cluster-id-1" { + t.Fatalf("new cluster should return the new cluster id, expected: new-cluster-id-1, got %s", clusterId) + } +} + func TestNewClusterInvalidCloud(t *testing.T) { clusterId, err := NewCluster(context.TODO(), nil, struct{}{}) if err == nil { @@ -1313,6 +1518,8 @@ func TestGetSupportedInstanceTypes(t *testing.T) { testGetSupportedInstanceTypes(t, AWS, "region1") testGetSupportedInstanceTypes(t, AZURE, "") testGetSupportedInstanceTypes(t, AZURE, "region1") + testGetSupportedInstanceTypes(t, GCP, "") + testGetSupportedInstanceTypes(t, GCP, "region1") } func TestGetSupportedInstanceTypes_unknownProvider(t *testing.T) { @@ -2729,3 +2936,89 @@ func TestNewClusterAWS_withArrowFlight(t *testing.T) { t.Fatalf("new cluster should return the new cluster id, expected: new-cluster-id-1, got %s", clusterId) } } + +func TestNewClusterFromBackup_GCP_changeConfig(t *testing.T) { + apiClient := &HopsworksAIClient{ + Client: &test.HttpClientFixture{ + ExpectMethod: 
http.MethodPost, + ExpectPath: "/api/clusters/restore/backup-id-1", + ExpectRequestBody: `{ + "cluster":{ + "name": "new-cluster-name", + "sshKeyName": "new-ssh-key", + "tags": [ + { + "name": "tag1", + "value": "tag1-value" + } + ], + "autoscale":{ + "nonGpu": { + "instanceType": "new-node-type", + "diskSize": 512, + "minWorkers": 1, + "maxWorkers": 10, + "standbyWorkers": 0.7, + "downscaleWaitTime": 500, + "spotInfo": { + "maxPrice": 100, + "fallBackOnDemand": false + } + } + }, + "serviceAccountEmail": "service@account.ai", + "networkName": "network-name-1", + "subNetworkName": "sub-name-1" + } + }`, + ResponseCode: http.StatusOK, + ResponseBody: `{ + "apiVersion": "v1", + "status": "ok", + "code": 200, + "payload": { + "id": "cluster-id-1" + } + }`, + T: t, + }, + } + + id, err := NewClusterFromBackup(context.TODO(), apiClient, "backup-id-1", CreateGCPClusterFromBackup{ + CreateClusterFromBackup: CreateClusterFromBackup{ + Name: "new-cluster-name", + SshKeyName: "new-ssh-key", + Tags: []ClusterTag{ + { + Name: "tag1", + Value: "tag1-value", + }, + }, + Autoscale: &AutoscaleConfiguration{ + NonGPU: &AutoscaleConfigurationBase{ + InstanceType: "new-node-type", + DiskSize: 512, + MinWorkers: 1, + MaxWorkers: 10, + StandbyWorkers: 0.7, + DownscaleWaitTime: 500, + SpotInfo: &SpotConfiguration{ + MaxPrice: 100, + FallBackOnDemand: false, + }, + }, + }, + }, + ServiceAccountEmail: "service@account.ai", + NetworkName: "network-name-1", + SubNetworkName: "sub-name-1", + }) + + if err != nil { + t.Fatalf("should not throw an error, but got %s", err) + } + + if id != "cluster-id-1" { + t.Fatalf("expected cluster id (cluster-id-1), but got %s", id) + } +} diff --git a/hopsworksai/internal/api/model.go b/hopsworksai/internal/api/model.go index 1b5f3a7..b57b690 100644 --- a/hopsworksai/internal/api/model.go +++ b/hopsworksai/internal/api/model.go @@ -11,6 +11,7 @@ type CloudProvider string const ( AWS CloudProvider = "AWS" AZURE CloudProvider = "AZURE" + GCP CloudProvider = 
"GCP" ) func (c CloudProvider) String() string { @@ -181,6 +182,7 @@ type Cluster struct { BackupRetentionPeriod int `json:"backupRetentionPeriod"` Azure AzureCluster `json:"azure,omitempty"` AWS AWSCluster `json:"aws,omitempty"` + GCP GCPCluster `json:"gcp,omitempty"` Ports ServiceOpenPorts `json:"ports"` RonDB *RonDBConfiguration `json:"ronDB,omitempty"` Autoscale *AutoscaleConfiguration `json:"autoscale,omitempty"` @@ -203,6 +205,10 @@ func (c *Cluster) IsAzureCluster() bool { return c.Provider == AZURE } +func (c *Cluster) IsGCPCluster() bool { + return c.Provider == GCP +} + type AzureEncryptionConfiguration struct { Mode string `json:"mode"` } @@ -261,6 +267,22 @@ type AWSCluster struct { EBSEncryption *EBSEncryption `json:"ebsEncryption,omitempty"` } +type GCPDiskEncryption struct { + CustomerManagedKey string `json:"customerManagedKey"` +} + +type GCPCluster struct { + Project string `json:"project"` + Region string `json:"region"` + Zone string `json:"zone"` + ServiceAccountEmail string `json:"serviceAccountEmail"` + BucketName string `json:"bucketName"` + NetworkName string `json:"networkName"` + SubNetworkName string `json:"subNetworkName"` + GkeClusterName string `json:"gkeClusterName"` + DiskEncryption *GCPDiskEncryption `json:"diskEncryption,omitempty"` +} + type NodeConfiguration struct { InstanceType string `json:"instanceType"` DiskSize int `json:"diskSize"` @@ -330,6 +352,11 @@ type CreateAWSCluster struct { AWSCluster } +type CreateGCPCluster struct { + CreateCluster + GCPCluster +} + type NodeType string const ( @@ -436,6 +463,11 @@ type NewAzureClusterRequest struct { CreateRequest CreateAzureCluster `json:"cluster"` } +type NewGCPClusterRequest struct { + CloudProvider CloudProvider `json:"cloudProvider"` + CreateRequest CreateGCPCluster `json:"cluster"` +} + type NewClusterResponse struct { BaseResponse Payload struct { @@ -463,6 +495,7 @@ type GetSupportedInstanceTypesResponse struct { Payload struct { AWS SupportedInstanceTypes `json:"aws"` 
AZURE SupportedInstanceTypes `json:"azure"` + GCP SupportedInstanceTypes `json:"gcp"` } `json:"payload"` } @@ -549,6 +582,13 @@ type CreateAWSClusterFromBackup struct { SecurityGroupId string `json:"securityGroupId,omitempty"` } +type CreateGCPClusterFromBackup struct { + CreateClusterFromBackup + ServiceAccountEmail string `json:"serviceAccountEmail"` + NetworkName string `json:"networkName"` + SubNetworkName string `json:"subNetworkName"` +} + type NewClusterFromBackupRequest struct { CreateRequest interface{} `json:"cluster"` } diff --git a/hopsworksai/internal/api/model_test.go b/hopsworksai/internal/api/model_test.go index 925b704..ec41894 100644 --- a/hopsworksai/internal/api/model_test.go +++ b/hopsworksai/internal/api/model_test.go @@ -40,6 +40,23 @@ func TestIsAZURECluster(t *testing.T) { } } +func TestIsGCPCluster(t *testing.T) { + cluster := &Cluster{ + Provider: GCP, + } + if !cluster.IsGCPCluster() { + t.Fatal("is gcp cluster should return true") + } + cluster.Provider = AWS + if cluster.IsGCPCluster() { + t.Fatal("is gcp cluster should return false") + } + cluster.Provider = "" + if cluster.IsGCPCluster() { + t.Fatal("is gcp cluster should return false") + } +} + func TestValidateResponse(t *testing.T) { resp := BaseResponse{} diff --git a/hopsworksai/internal/structure/cluster.go b/hopsworksai/internal/structure/cluster.go index da0cfd6..38a0d65 100644 --- a/hopsworksai/internal/structure/cluster.go +++ b/hopsworksai/internal/structure/cluster.go @@ -36,6 +36,7 @@ func FlattenCluster(cluster *api.Cluster) map[string]interface{} { "workers": flattenWorkers(cluster.Autoscale, cluster.ClusterConfiguration.Workers), "aws_attributes": flattenAWSAttributes(cluster), "azure_attributes": flattenAzureAttributes(cluster), + "gcp_attributes": flattenGCPAttributes(cluster), "open_ports": flattenPorts(&cluster.Ports), "tags": flattenTags(cluster.Tags), "rondb": flattenRonDB(cluster.RonDB), @@ -481,3 +482,43 @@ func flattenUpgradeInProgress(upgradeInProgress 
*api.UpgradeInProgress) []interf }, } } + +func flattenGCPDiskEncryption(diskEncryption *api.GCPDiskEncryption) []map[string]interface{} { + if diskEncryption == nil { + return []map[string]interface{}{} + } + + return []map[string]interface{}{ + { + "customer_managed_encryption_key": diskEncryption.CustomerManagedKey, + }, + } +} + +func flattenGCPAttributes(cluster *api.Cluster) []interface{} { + if !cluster.IsGCPCluster() { + return nil + } + + gcpAttributes := make([]interface{}, 1) + gcpAttributes[0] = map[string]interface{}{ + "project_id": cluster.GCP.Project, + "region": cluster.GCP.Region, + "zone": cluster.GCP.Zone, + "service_account_email": cluster.GCP.ServiceAccountEmail, + "bucket": []map[string]interface{}{ + { + "name": cluster.GCP.BucketName, + }, + }, + "network": []map[string]interface{}{ + { + "network_name": cluster.GCP.NetworkName, + "subnetwork_name": cluster.GCP.SubNetworkName, + }, + }, + "gke_cluster_name": cluster.GCP.GkeClusterName, + "disk_encryption": flattenGCPDiskEncryption(cluster.GCP.DiskEncryption), + } + return gcpAttributes +} diff --git a/hopsworksai/internal/structure/cluster_test.go b/hopsworksai/internal/structure/cluster_test.go index 4901e56..e75781e 100644 --- a/hopsworksai/internal/structure/cluster_test.go +++ b/hopsworksai/internal/structure/cluster_test.go @@ -425,6 +425,7 @@ func TestFlattenCluster(t *testing.T) { "workers": flattenWorkers(input.Autoscale, input.ClusterConfiguration.Workers), "aws_attributes": emptyAttributes, "azure_attributes": emptyAttributes, + "gcp_attributes": emptyAttributes, "open_ports": flattenPorts(&input.Ports), "tags": flattenTags(input.Tags), "rondb": flattenRonDB(input.RonDB), @@ -439,9 +440,10 @@ func TestFlattenCluster(t *testing.T) { "custom_hosted_zone": input.CustomHostedZone, } - for _, cloud := range []api.CloudProvider{api.AWS, api.AZURE} { + for _, cloud := range []api.CloudProvider{api.AWS, api.AZURE, api.GCP} { input.Provider = cloud - if cloud == api.AWS { + switch cloud { 
+ case api.AWS: input.AWS = api.AWSCluster{ Region: "region-1", InstanceProfileArn: "instance-profile-1", @@ -453,9 +455,11 @@ func TestFlattenCluster(t *testing.T) { EcrRegistryAccountId: "ecr-registry-account-1", } input.Azure = api.AzureCluster{} + input.GCP = api.GCPCluster{} expected["aws_attributes"] = flattenAWSAttributes(input) expected["azure_attributes"] = emptyAttributes - } else if cloud == api.AZURE { + expected["gcp_attributes"] = emptyAttributes + case api.AZURE: input.Azure = api.AzureCluster{ Location: "location-1", ResourceGroup: "resource-group-1", @@ -472,6 +476,23 @@ func TestFlattenCluster(t *testing.T) { input.AWS = api.AWSCluster{} expected["aws_attributes"] = emptyAttributes expected["azure_attributes"] = flattenAzureAttributes(input) + expected["gcp_attributes"] = emptyAttributes + case api.GCP: + input.GCP = api.GCPCluster{ + Project: "project-1", + Region: "region-1", + Zone: "zone-1", + ServiceAccountEmail: "serviceaccount@iam.com", + BucketName: "my-bucket", + NetworkName: "my-network", + SubNetworkName: "my-subnetwork", + GkeClusterName: "my-gke-cluster", + } + input.AWS = api.AWSCluster{} + input.Azure = api.AzureCluster{} + expected["aws_attributes"] = emptyAttributes + expected["azure_attributes"] = emptyAttributes + expected["gcp_attributes"] = flattenGCPAttributes(input) } output := FlattenCluster(input) @@ -1344,6 +1365,119 @@ func TestFlattenClusters(t *testing.T) { SearchDomain: "internal.cloudapp.net", }, }, + { + Id: "cluster-id-2", + Name: "cluster", + State: "state-1", + ActivationState: "activation-state-1", + InitializationStage: "initializtion-stage-1", + CreatedOn: 1605374387069, + StartedOn: 1605374388069, + Version: "cluster-version", + URL: "cluster-url", + Provider: api.GCP, + Tags: []api.ClusterTag{ + { + Name: "tag1", + Value: "tagvalue1", + }, + }, + PublicIPAttached: true, + LetsEncryptIssued: true, + ManagedUsers: true, + BackupRetentionPeriod: 0, + ClusterConfiguration: api.ClusterConfigurationStatus{ + Head: 
api.HeadConfigurationStatus{ + HeadConfiguration: api.HeadConfiguration{ + NodeConfiguration: api.NodeConfiguration{ + InstanceType: "head-node-type-1", + DiskSize: 512, + }, + HAEnabled: false, + }, + NodeId: "head-node-id-1", + }, + Workers: []api.WorkerConfiguration{ + { + NodeConfiguration: api.NodeConfiguration{ + InstanceType: "worker-node-type-1", + DiskSize: 256, + }, + Count: 1, + }, + }, + }, + Ports: api.ServiceOpenPorts{ + FeatureStore: true, + OnlineFeatureStore: false, + Kafka: true, + SSH: false, + }, + RonDB: &api.RonDBConfiguration{ + Configuration: api.RonDBBaseConfiguration{ + NdbdDefault: api.RonDBNdbdDefaultConfiguration{ + ReplicationFactor: 2, + }, + General: api.RonDBGeneralConfiguration{ + Benchmark: api.RonDBBenchmarkConfiguration{ + GrantUserPrivileges: false, + }, + }, + }, + ManagementNodes: api.RonDBNodeConfiguration{ + NodeConfiguration: api.NodeConfiguration{ + InstanceType: "mgm-node-1", + DiskSize: 30, + }, + Count: 1, + }, + DataNodes: api.RonDBNodeConfiguration{ + NodeConfiguration: api.NodeConfiguration{ + InstanceType: "data-node-1", + DiskSize: 512, + }, + Count: 2, + }, + MYSQLNodes: api.MYSQLNodeConfiguration{ + RonDBNodeConfiguration: api.RonDBNodeConfiguration{ + NodeConfiguration: api.NodeConfiguration{ + InstanceType: "mysqld-node-1", + DiskSize: 100, + }, + Count: 1, + }, + ArrowFlightServer: false, + }, + APINodes: api.RonDBNodeConfiguration{ + NodeConfiguration: api.NodeConfiguration{ + InstanceType: "api-node-1", + DiskSize: 50, + }, + Count: 1, + }, + }, + Autoscale: &api.AutoscaleConfiguration{ + NonGPU: &api.AutoscaleConfigurationBase{ + InstanceType: "auto-node-1", + DiskSize: 256, + MinWorkers: 0, + MaxWorkers: 10, + StandbyWorkers: 0.5, + DownscaleWaitTime: 300, + }, + }, + InitScript: "#!/usr/bin/env bash\nset -e\necho 'Hello World'", + GCP: api.GCPCluster{ + Project: "project-1", + Region: "region-1", + Zone: "zone-1", + ServiceAccountEmail: "serviceaccount@iam.com", + BucketName: "my-bucket", + NetworkName: 
"my-network", + SubNetworkName: "my-subnetwork", + GkeClusterName: "my-gke-cluster", + }, + }, } var emptyAttributes []interface{} = nil @@ -1367,6 +1501,7 @@ func TestFlattenClusters(t *testing.T) { "workers": flattenWorkers(input[0].Autoscale, input[0].ClusterConfiguration.Workers), "aws_attributes": flattenAWSAttributes(&input[0]), "azure_attributes": emptyAttributes, + "gcp_attributes": emptyAttributes, "open_ports": flattenPorts(&input[0].Ports), "tags": flattenTags(input[0].Tags), "rondb": flattenRonDB(input[0].RonDB), @@ -1392,12 +1527,39 @@ func TestFlattenClusters(t *testing.T) { "workers": flattenWorkers(input[1].Autoscale, input[1].ClusterConfiguration.Workers), "aws_attributes": emptyAttributes, "azure_attributes": flattenAzureAttributes(&input[1]), + "gcp_attributes": emptyAttributes, "open_ports": flattenPorts(&input[1].Ports), "tags": flattenTags(input[1].Tags), "rondb": flattenRonDB(input[1].RonDB), "autoscale": flattenAutoscaleConfiguration(input[1].Autoscale), "init_script": input[1].InitScript, }, + { + "cluster_id": input[2].Id, + "name": input[2].Name, + "url": input[2].URL, + "state": input[2].State, + "activation_state": input[2].ActivationState, + "creation_date": time.Unix(input[0].CreatedOn, 0).Format(time.RFC3339), + "start_date": time.Unix(input[0].StartedOn, 0).Format(time.RFC3339), + "version": input[2].Version, + "ssh_key": input[2].SshKeyName, + "head": flattenHead(&input[0].ClusterConfiguration.Head), + "issue_lets_encrypt_certificate": input[2].LetsEncryptIssued, + "attach_public_ip": input[2].PublicIPAttached, + "managed_users": input[2].ManagedUsers, + "backup_retention_period": input[2].BackupRetentionPeriod, + "update_state": "none", + "workers": flattenWorkers(input[0].Autoscale, input[0].ClusterConfiguration.Workers), + "aws_attributes": emptyAttributes, + "azure_attributes": emptyAttributes, + "gcp_attributes": flattenGCPAttributes(&input[2]), + "open_ports": flattenPorts(&input[2].Ports), + "tags": 
flattenTags(input[2].Tags), + "rondb": flattenRonDB(input[2].RonDB), + "autoscale": flattenAutoscaleConfiguration(input[2].Autoscale), + "init_script": input[2].InitScript, + }, } output := FlattenClusters(input) @@ -1905,3 +2067,99 @@ func TestExpandRonDBMySQLNodeConfiguration(t *testing.T) { t.Fatalf("error while matching:\nexpected %#v \nbut got %#v", expected, output) } } + +func TestFlattenGCPAttributes(t *testing.T) { + input := &api.Cluster{ + Provider: api.GCP, + GCP: api.GCPCluster{ + Project: "project-1", + Region: "region-1", + Zone: "zone-1", + ServiceAccountEmail: "serviceaccount@iam.com", + BucketName: "my-bucket", + NetworkName: "my-network", + SubNetworkName: "my-subnetwork", + GkeClusterName: "my-gke-cluster", + DiskEncryption: &api.GCPDiskEncryption{ + CustomerManagedKey: "customer-managed-key", + }, + }, + } + + expected := []interface{}{ + map[string]interface{}{ + "project_id": input.GCP.Project, + "region": input.GCP.Region, + "zone": input.GCP.Zone, + "service_account_email": input.GCP.ServiceAccountEmail, + "network": []map[string]interface{}{ + { + "network_name": input.GCP.NetworkName, + "subnetwork_name": input.GCP.SubNetworkName, + }, + }, + "gke_cluster_name": input.GCP.GkeClusterName, + "bucket": []map[string]interface{}{ + { + "name": input.GCP.BucketName, + }, + }, + "disk_encryption": []map[string]interface{}{ + { + "customer_managed_encryption_key": input.GCP.DiskEncryption.CustomerManagedKey, + }, + }, + }, + } + + output := flattenGCPAttributes(input) + if !reflect.DeepEqual(expected, output) { + t.Fatalf("error while matching:\nexpected %#v \nbut got %#v", expected, output) + } + + input.Provider = "" + if flattenGCPAttributes(input) != nil { + t.Fatalf("should return nil if the provider is not %s", api.GCP) + } + + input.Provider = api.AZURE + if flattenGCPAttributes(input) != nil { + t.Fatalf("should return nil if the provider is not %s", api.GCP) + } + + input.Provider = "gcp" + if flattenGCPAttributes(input) != nil { +
t.Fatal("cloud provider should be always capital") + } +} + +func TestFlattenGCPDiskEncryption(t *testing.T) { + input := []*api.GCPDiskEncryption{ + {}, + { + CustomerManagedKey: "my-kms-key", + }, + nil, + } + + expected := [][]map[string]interface{}{ + { + map[string]interface{}{ + "customer_managed_encryption_key": "", + }, + }, + { + map[string]interface{}{ + "customer_managed_encryption_key": "my-kms-key", + }, + }, + {}, + } + + for i := range input { + output := flattenGCPDiskEncryption(input[i]) + if !reflect.DeepEqual(expected[i], output) { + t.Fatalf("error while matching[%d]:\nexpected %#v \nbut got %#v", i, expected[i], output) + } + } +} diff --git a/hopsworksai/provider.go b/hopsworksai/provider.go index 0215043..c9556b0 100644 --- a/hopsworksai/provider.go +++ b/hopsworksai/provider.go @@ -78,15 +78,16 @@ func Provider(version string) func() *schema.Provider { }, }, DataSourcesMap: map[string]*schema.Resource{ - "hopsworksai_cluster": dataSourceCluster(), - "hopsworksai_clusters": dataSourceClusters(), - "hopsworksai_instance_type": dataSourceInstanceType(), - "hopsworksai_instance_types": dataSourceInstanceTypes(), - "hopsworksai_aws_instance_profile_policy": dataSourceAWSInstanceProfilePolicy(), - "hopsworksai_azure_user_assigned_identity_permissions": dataSourceAzureUserAssignedIdentityPermissions(), - "hopsworksai_backups": dataSourceBackups(), - "hopsworksai_backup": dataSourceBackup(), - "hopsworksai_version": dataSourceVersion(), + "hopsworksai_cluster": dataSourceCluster(), + "hopsworksai_clusters": dataSourceClusters(), + "hopsworksai_instance_type": dataSourceInstanceType(), + "hopsworksai_instance_types": dataSourceInstanceTypes(), + "hopsworksai_aws_instance_profile_policy": dataSourceAWSInstanceProfilePolicy(), + "hopsworksai_azure_user_assigned_identity_permissions": dataSourceAzureUserAssignedIdentityPermissions(), + "hopsworksai_backups": dataSourceBackups(), + "hopsworksai_backup": dataSourceBackup(), + "hopsworksai_version": 
dataSourceVersion(), + "hopsworksai_gcp_service_account_custom_role_permissions": dataSourceGCPServiceAccountCustomRolePermissions(), }, ResourcesMap: map[string]*schema.Resource{ "hopsworksai_cluster": clusterResource(), diff --git a/hopsworksai/resource_cluster.go b/hopsworksai/resource_cluster.go index 0adeaa5..b89dcc8 100644 --- a/hopsworksai/resource_cluster.go +++ b/hopsworksai/resource_cluster.go @@ -305,7 +305,7 @@ func clusterSchema() map[string]*schema.Schema { ForceNew: true, MaxItems: 1, Elem: awsAttributesSchema(), - ExactlyOneOf: []string{"aws_attributes", "azure_attributes"}, + ExactlyOneOf: []string{"aws_attributes", "azure_attributes", "gcp_attributes"}, }, "azure_attributes": { Description: "The configurations required to run the cluster on Microsoft Azure.", @@ -314,7 +314,16 @@ func clusterSchema() map[string]*schema.Schema { ForceNew: true, MaxItems: 1, Elem: azureAttributesSchema(), - ExactlyOneOf: []string{"aws_attributes", "azure_attributes"}, + ExactlyOneOf: []string{"aws_attributes", "azure_attributes", "gcp_attributes"}, + }, + "gcp_attributes": { + Description: "The configurations required to run the cluster on Google GCP.", + Type: schema.TypeList, + Optional: true, + ForceNew: true, + MaxItems: 1, + Elem: gcpAttributesSchema(), + ExactlyOneOf: []string{"aws_attributes", "azure_attributes", "gcp_attributes"}, }, "open_ports": { Description: "Open the required ports to communicate with one of the Hopsworks services.", @@ -1070,6 +1079,103 @@ func azureAttributesSchema() *schema.Resource { } } +func gcpAttributesSchema() *schema.Resource { + return &schema.Resource{ + Schema: map[string]*schema.Schema{ + "project_id": { + Description: "The GCP project where the cluster will be created.", + Type: schema.TypeString, + Required: true, + ForceNew: true, + }, + "region": { + Description: "The GCP region where the cluster will be created.", + Type: schema.TypeString, + Required: true, + ForceNew: true, + }, + "zone": { + Description: "The GCP 
zone where the cluster will be created.", + Type: schema.TypeString, + Required: true, + ForceNew: true, + }, + "bucket": { + Description: "The bucket configurations.", + Type: schema.TypeList, + Optional: true, + Computed: true, + ForceNew: true, + MaxItems: 1, + Elem: &schema.Resource{ + Schema: map[string]*schema.Schema{ + "name": { + Description: "The name of the GCP storage bucket that the cluster will use to store data in.", + Type: schema.TypeString, + Required: true, + ForceNew: true, + }, + }, + }, + }, + "service_account_email": { + Description: "The service account email address that the cluster will be started with.", + Type: schema.TypeString, + Required: true, + ForceNew: true, + }, + "network": { + Description: "The network configurations.", + Type: schema.TypeList, + Optional: true, + Computed: true, + ForceNew: true, + MaxItems: 1, + Elem: &schema.Resource{ + Schema: map[string]*schema.Schema{ + "network_name": { + Description: "The network name.", + Type: schema.TypeString, + Required: true, + ForceNew: true, + }, + "subnetwork_name": { + Description: "The subnetwork name.", + Type: schema.TypeString, + Required: true, + ForceNew: true, + }, + }, + }, + }, + "gke_cluster_name": { + Description: "The name of the Google GKE cluster.", + Type: schema.TypeString, + Optional: true, + ForceNew: true, + }, + "disk_encryption": { + Description: "The disk encryption configuration.", + Type: schema.TypeList, + Optional: true, + ForceNew: true, + Computed: true, + MaxItems: 1, + Elem: &schema.Resource{ + Schema: map[string]*schema.Schema{ + "customer_managed_encryption_key": { + Description: "Specify a customer-managed encryption key to be used for encryption of local storage. 
The key has to use the format: projects/PROJECT_ID/locations/REGION/keyRings/KEY_RING/cryptoKeys/KEY.", + Type: schema.TypeString, + Optional: true, + ForceNew: true, + }, + }, + }, + }, + }, + } +} + func clusterResource() *schema.Resource { return &schema.Resource{ Description: "Use this resource to create, read, update, and delete clusters on Hopsworks.ai.", @@ -1119,6 +1225,13 @@ func resourceClusterCreate(ctx context.Context, d *schema.ResourceData, meta int } } + if gcp, ok := d.GetOk("gcp_attributes"); ok { + gcpAttributes := gcp.([]interface{}) + if len(gcpAttributes) > 0 { + createRequest = createGCPCluster(d, baseRequest) + } + } + if createRequest == nil { return diag.Errorf("no request to create cluster") } @@ -1288,6 +1401,36 @@ func createAzureCluster(d *schema.ResourceData, baseRequest *api.CreateCluster) return &req, nil } +func createGCPCluster(d *schema.ResourceData, baseRequest *api.CreateCluster) *api.CreateGCPCluster { + req := api.CreateGCPCluster{ + CreateCluster: *baseRequest, + GCPCluster: api.GCPCluster{ + Project: d.Get("gcp_attributes.0.project_id").(string), + Region: d.Get("gcp_attributes.0.region").(string), + Zone: d.Get("gcp_attributes.0.zone").(string), + ServiceAccountEmail: d.Get("gcp_attributes.0.service_account_email").(string), + BucketName: d.Get("gcp_attributes.0.bucket.0.name").(string), + }, + } + + if _, ok := d.GetOk("gcp_attributes.0.network"); ok { + req.NetworkName = d.Get("gcp_attributes.0.network.0.network_name").(string) + req.SubNetworkName = d.Get("gcp_attributes.0.network.0.subnetwork_name").(string) + } + + if v, ok := d.GetOk("gcp_attributes.0.gke_cluster_name"); ok { + req.GkeClusterName = v.(string) + } + + if v, oke := d.GetOk("gcp_attributes.0.disk_encryption.0.customer_managed_encryption_key"); oke { + req.DiskEncryption = &api.GCPDiskEncryption{ + CustomerManagedKey: v.(string), + } + } + + return &req +} + func createClusterBaseRequest(d *schema.ResourceData) (*api.CreateCluster, error) { headConfig := 
d.Get("head").([]interface{})[0].(map[string]interface{}) @@ -1316,7 +1459,7 @@ func createClusterBaseRequest(d *schema.ResourceData) (*api.CreateCluster, error if v, ok := d.GetOk("ssh_key"); ok { createCluster.SshKeyName = v.(string) } else { - if _, ok := d.GetOk("aws_attributes"); !ok { + if _, ok := d.GetOk("azure_attributes"); ok { return nil, fmt.Errorf("SSH key is required") } } diff --git a/hopsworksai/resource_cluster_from_backup.go b/hopsworksai/resource_cluster_from_backup.go index 33c77a5..eaa467b 100644 --- a/hopsworksai/resource_cluster_from_backup.go +++ b/hopsworksai/resource_cluster_from_backup.go @@ -2,6 +2,7 @@ package hopsworksai import ( "context" + "strings" "time" "github.com/hashicorp/terraform-plugin-sdk/v2/diag" @@ -37,7 +38,7 @@ func clusterFromBackupResource() *schema.Resource { baseSchema["aws_attributes"].Optional = true baseSchema["aws_attributes"].ForceNew = true baseSchema["aws_attributes"].MaxItems = 1 - baseSchema["aws_attributes"].ConflictsWith = []string{"azure_attributes"} + baseSchema["aws_attributes"].ConflictsWith = []string{"azure_attributes", "gcp_attributes"} clusterAWSAttributesSchema := baseSchema["aws_attributes"].Elem.(*schema.Resource).Schema clusterAWSAttributesSchema["instance_profile_arn"].Optional = true @@ -50,11 +51,22 @@ func clusterFromBackupResource() *schema.Resource { baseSchema["azure_attributes"].Optional = true baseSchema["azure_attributes"].ForceNew = true baseSchema["azure_attributes"].MaxItems = 1 - baseSchema["azure_attributes"].ConflictsWith = []string{"aws_attributes"} + baseSchema["azure_attributes"].ConflictsWith = []string{"aws_attributes", "gcp_attributes"} clusterAZUREAttributesSchema := baseSchema["azure_attributes"].Elem.(*schema.Resource).Schema clusterAZUREAttributesSchema["network"] = azureAttributesSchema().Schema["network"] + // allow changing gcp + baseSchema["gcp_attributes"].Optional = true + baseSchema["gcp_attributes"].ForceNew = true + baseSchema["gcp_attributes"].MaxItems = 1 + 
baseSchema["gcp_attributes"].ConflictsWith = []string{"aws_attributes", "azure_attributes"} + + clusterGCPAttributesSchema := baseSchema["gcp_attributes"].Elem.(*schema.Resource).Schema + clusterGCPAttributesSchema["service_account_email"].Optional = true + clusterGCPAttributesSchema["service_account_email"].ForceNew = true + clusterGCPAttributesSchema["network"] = gcpAttributesSchema().Schema["network"] + // allow the following attributes to be updated later after creation baseSchema["update_state"] = clusterResourceSchema["update_state"] baseSchema["open_ports"] = clusterResourceSchema["open_ports"] @@ -129,6 +141,10 @@ func resourceClusterFromBackupCreate(ctx context.Context, d *schema.ResourceData restoreRequest = &api.CreateAzureClusterFromBackup{ CreateClusterFromBackup: baseRequest, } + case api.GCP: + restoreRequest = &api.CreateGCPClusterFromBackup{ + CreateClusterFromBackup: baseRequest, + } default: return diag.Errorf("Unknown cloud provider %s for backup %s", backup.CloudProvider, backupId) } @@ -155,7 +171,7 @@ func resourceClusterFromBackupCreate(ctx context.Context, d *schema.ResourceData awsRequest.SecurityGroupId = v.(string) } } else { - return diag.Errorf("incompatible cloud configuration, expected azure_attributes instead") + return diag.Errorf("incompatible cloud configuration, expected %s_attributes instead", strings.ToLower(backup.CloudProvider.String())) } } @@ -177,7 +193,25 @@ func resourceClusterFromBackupCreate(ctx context.Context, d *schema.ResourceData azureRequest.SecurityGroupName = v.(string) } } else { - return diag.Errorf("incompatible cloud configuration, expected aws_attributes instead") + return diag.Errorf("incompatible cloud configuration, expected %s_attributes instead", strings.ToLower(backup.CloudProvider.String())) + } + } + + if gcp, ok := d.GetOk("gcp_attributes"); ok && len(gcp.([]interface{})) > 0 { + if gcpRequest, okV := restoreRequest.(*api.CreateGCPClusterFromBackup); okV { + if v, ok := 
d.GetOk("gcp_attributes.0.service_account_email"); ok { + gcpRequest.ServiceAccountEmail = v.(string) + } + + if v, ok := d.GetOk("gcp_attributes.0.network.0.network_name"); ok { + gcpRequest.NetworkName = v.(string) + } + + if v, ok := d.GetOk("gcp_attributes.0.network.0.subnetwork_name"); ok { + gcpRequest.SubNetworkName = v.(string) + } + } else { + return diag.Errorf("incompatible cloud configuration, expected %s_attributes instead", strings.ToLower(backup.CloudProvider.String())) } } diff --git a/hopsworksai/resource_cluster_from_backup_test.go b/hopsworksai/resource_cluster_from_backup_test.go index 55d656e..97f99f0 100644 --- a/hopsworksai/resource_cluster_from_backup_test.go +++ b/hopsworksai/resource_cluster_from_backup_test.go @@ -379,6 +379,53 @@ func TestClusterFromBackupCreate_AZURE_incompatibleConfig(t *testing.T) { r.Apply(t, context.TODO()) } +func TestClusterFromBackupCreate_GCP_incompatibleConfig(t *testing.T) { + t.Parallel() + r := test.ResourceFixture{ + HttpOps: []test.Operation{ + { + Method: http.MethodGet, + Path: "/api/backups/backup-id-1", + Response: `{ + "apiVersion": "v1", + "status": "ok", + "code": 200, + "payload":{ + "backup": { + "backupId" : "backup-id-1", + "backupName": "backup-1", + "clusterId": "cluster-id-1", + "cloudProvider": "GCP", + "createdOn": 100, + "state": "succeed", + "stateMessage": "backup completed" + } + } + }`, + }, + }, + Resource: clusterFromBackupResource(), + OperationContextFunc: clusterFromBackupResource().CreateContext, + State: map[string]interface{}{ + "source_backup_id": "backup-id-1", + "azure_attributes": []interface{}{ + map[string]interface{}{ + "network": []interface{}{ + map[string]interface{}{ + "resource_group": "resource-group-1", + "virtual_network_name": "virtual-network-name-1", + "subnet_name": "subnet-name-1", + "security_group_name": "security-group-name-1", + }, + }, + }, + }, + }, + ExpectError: "incompatible cloud configuration, expected gcp_attributes instead", + } + r.Apply(t, 
context.TODO()) +} + func testClusterFromBackupCreate_update(t *testing.T, cloudProvider api.CloudProvider, expectedReqBody string, state map[string]interface{}) { state["source_backup_id"] = "backup-id-1" r := test.ResourceFixture{ @@ -717,3 +764,63 @@ func TestClusterFromBackupCreate_backup_notfound(t *testing.T) { } r.Apply(t, context.TODO()) } + +func TestClusterFromBackupCreate_GCP_update(t *testing.T) { + t.Parallel() + testClusterFromBackupCreate_update(t, api.GCP, `{ + "cluster": { + "name": "new-cluster-name", + "sshKeyName": "new-ssh-key", + "tags": [ + { + "name": "tag1", + "value": "tag1-value" + } + ], + "autoscale": { + "nonGpu": { + "instanceType": "non-gpu-node", + "diskSize": 100, + "minWorkers": 0, + "maxWorkers": 10, + "standbyWorkers": 0.5, + "downscaleWaitTime": 200 + } + }, + "serviceAccountEmail": "service@account.ai", + "networkName": "network-name-1", + "subNetworkName": "sub-name-1" + } + }`, map[string]interface{}{ + "name": "new-cluster-name", + "ssh_key": "new-ssh-key", + "tags": map[string]interface{}{ + "tag1": "tag1-value", + }, + "autoscale": []interface{}{ + map[string]interface{}{ + "non_gpu_workers": []interface{}{ + map[string]interface{}{ + "instance_type": "non-gpu-node", + "disk_size": 100, + "min_workers": 0, + "max_workers": 10, + "standby_workers": 0.5, + "downscale_wait_time": 200, + }, + }, + }, + }, + "gcp_attributes": []interface{}{ + map[string]interface{}{ + "service_account_email": "service@account.ai", + "network": []interface{}{ + map[string]interface{}{ + "network_name": "network-name-1", + "subnetwork_name": "sub-name-1", + }, + }, + }, + }, + }) +} diff --git a/hopsworksai/resource_cluster_test.go b/hopsworksai/resource_cluster_test.go index f934ce2..6fd97b4 100644 --- a/hopsworksai/resource_cluster_test.go +++ b/hopsworksai/resource_cluster_test.go @@ -6315,3 +6315,363 @@ func TestClusterCreate_withArrowFlight(t *testing.T) { } r.Apply(t, context.TODO()) } + +func TestClusterCreate_GCP(t *testing.T) { + 
t.Parallel() + r := test.ResourceFixture{ + HttpOps: []test.Operation{ + { + Method: http.MethodPost, + Path: "/api/clusters", + Response: `{ + "apiVersion": "v1", + "status": "ok", + "code": 200, + "payload":{ + "id" : "new-cluster-id-1" + } + }`, + CheckRequestBody: func(reqBody io.Reader) error { + var req api.NewGCPClusterRequest + if err := json.NewDecoder(reqBody).Decode(&req); err != nil { + return err + } + expected := api.GCPCluster{ + Project: "project-1", + Region: "region-1", + Zone: "zone-1", + BucketName: "bucket-1", + ServiceAccountEmail: "email@iam.com", + } + if !reflect.DeepEqual(expected, req.CreateRequest.GCPCluster) { + return fmt.Errorf("error while matching:\nexpected %#v \nbut got %#v", expected, req.CreateRequest.GCPCluster) + } + return nil + }, + }, + { + Method: http.MethodGet, + Path: "/api/clusters/new-cluster-id-1", + Response: `{ + "apiVersion": "v1", + "status": "ok", + "code": 200, + "payload":{ + "cluster": { + "id" : "new-cluster-id-1", + "state": "running" + } + } + }`, + }, + { + Method: http.MethodPost, + Path: "/api/clusters/new-cluster-id-1/ports", + Response: `{ + "apiVersion": "v1", + "status": "ok", + "code": 200 + }`, + }, + }, + Resource: clusterResource(), + OperationContextFunc: clusterResource().CreateContext, + State: map[string]interface{}{ + "name": "cluster-1", + "version": "2.0", + "head": []interface{}{ + map[string]interface{}{ + "instance_type": "node-type-1", + "disk_size": 512, + }, + }, + "workers": []interface{}{ + map[string]interface{}{ + "instance_type": "node-type-2", + "disk_size": 256, + "count": 2, + }, + }, + "ssh_key": "ssh-key-1", + "tags": map[string]interface{}{ + "tag1": "tag1-value1", + }, + "gcp_attributes": []interface{}{ + map[string]interface{}{ + "project_id": "project-1", + "region": "region-1", + "zone": "zone-1", + "service_account_email": "email@iam.com", + "bucket": []interface{}{ + map[string]interface{}{ + "name": "bucket-1", + }, + }, + }, + }, + "open_ports": []interface{}{ + 
map[string]interface{}{ + "ssh": true, + "kafka": true, + "feature_store": true, + "online_feature_store": true, + }, + }, + }, + ExpectId: "new-cluster-id-1", + } + r.Apply(t, context.TODO()) +} + +func TestClusterCreate_GCP_setAll(t *testing.T) { + t.Parallel() + r := test.ResourceFixture{ + HttpOps: []test.Operation{ + { + Method: http.MethodPost, + Path: "/api/clusters", + Response: `{ + "apiVersion": "v1", + "status": "ok", + "code": 200, + "payload":{ + "id" : "new-cluster-id-1" + } + }`, + CheckRequestBody: func(reqBody io.Reader) error { + var req api.NewGCPClusterRequest + if err := json.NewDecoder(reqBody).Decode(&req); err != nil { + return err + } + expected := api.GCPCluster{ + Project: "project-1", + Region: "region-1", + Zone: "zone-1", + BucketName: "bucket-1", + ServiceAccountEmail: "email@iam.com", + NetworkName: "network-1", + SubNetworkName: "sub-1", + GkeClusterName: "cluster-1", + DiskEncryption: &api.GCPDiskEncryption{ + CustomerManagedKey: "key-1", + }, + } + if !reflect.DeepEqual(expected, req.CreateRequest.GCPCluster) { + return fmt.Errorf("error while matching:\nexpected %#v \nbut got %#v", expected, req.CreateRequest.GCPCluster) + } + return nil + }, + }, + { + Method: http.MethodGet, + Path: "/api/clusters/new-cluster-id-1", + Response: `{ + "apiVersion": "v1", + "status": "ok", + "code": 200, + "payload":{ + "cluster": { + "id" : "new-cluster-id-1", + "state": "running" + } + } + }`, + }, + { + Method: http.MethodPost, + Path: "/api/clusters/new-cluster-id-1/ports", + Response: `{ + "apiVersion": "v1", + "status": "ok", + "code": 200 + }`, + }, + }, + Resource: clusterResource(), + OperationContextFunc: clusterResource().CreateContext, + State: map[string]interface{}{ + "name": "cluster-1", + "version": "2.0", + "head": []interface{}{ + map[string]interface{}{ + "instance_type": "node-type-1", + "disk_size": 512, + }, + }, + "workers": []interface{}{ + map[string]interface{}{ + "instance_type": "node-type-2", + "disk_size": 256, + 
"count": 2, + }, + }, + "ssh_key": "ssh-key-1", + "tags": map[string]interface{}{ + "tag1": "tag1-value1", + }, + "gcp_attributes": []interface{}{ + map[string]interface{}{ + "project_id": "project-1", + "region": "region-1", + "zone": "zone-1", + "service_account_email": "email@iam.com", + "bucket": []interface{}{ + map[string]interface{}{ + "name": "bucket-1", + }, + }, + "network": []interface{}{ + map[string]interface{}{ + "network_name": "network-1", + "subnetwork_name": "sub-1", + }, + }, + "gke_cluster_name": "cluster-1", + "disk_encryption": []interface{}{ + map[string]interface{}{ + "customer_managed_encryption_key": "key-1", + }, + }, + }, + }, + "open_ports": []interface{}{ + map[string]interface{}{ + "ssh": true, + "kafka": true, + "feature_store": true, + "online_feature_store": true, + }, + }, + }, + ExpectId: "new-cluster-id-1", + } + r.Apply(t, context.TODO()) +} + +func TestClusterRead_GCP(t *testing.T) { + r := test.ResourceFixture{ + HttpOps: []test.Operation{ + { + Method: http.MethodGet, + Path: "/api/clusters/cluster-id-1", + Response: `{ + "apiVersion": "v1", + "statue": "ok", + "code": 200, + "payload":{ + "cluster": { + "id": "cluster-id-1", + "name": "cluster-name-1", + "state" : "running", + "activationState": "stoppable", + "initializationStage": "running", + "createdOn": 123, + "startedOn" : 123, + "version": "version-1", + "url": "https://cluster-url", + "provider": "GCP", + "tags": [ + { + "name": "tag1", + "value": "tag1-value1" + } + ], + "sshKeyName": "ssh-key-1", + "clusterConfiguration": { + "head": { + "instanceType": "node-type-1", + "diskSize": 512, + "nodeId": "head-node-id-1", + "privateIp": "ip1" + }, + "workers": [ + { + "instanceType": "node-type-2", + "diskSize": 256, + "count": 2, + "privateIps": ["ip2","ip3"] + } + ] + }, + "publicIPAttached": true, + "letsEncryptIssued": true, + "managedUsers": true, + "backupRetentionPeriod": 10, + "gcp": { + "project": "project-1", + "region": "region-1", + "zone": "zone-1", + 
"bucketName": "bucket-1", + "serviceAccountEmail": "email@iam.com", + "networkName": "network-1", + "subNetworkName": "sub-1" + } + } + } + }`, + }, + }, + Resource: clusterResource(), + OperationContextFunc: clusterResource().ReadContext, + Id: "cluster-id-1", + ExpectState: map[string]interface{}{ + "cluster_id": "cluster-id-1", + "state": "running", + "activation_state": "stoppable", + "creation_date": time.Unix(123, 0).Format(time.RFC3339), + "start_date": time.Unix(123, 0).Format(time.RFC3339), + "version": "version-1", + "url": "https://cluster-url", + "tags": map[string]interface{}{ + "tag1": "tag1-value1", + }, + "ssh_key": "ssh-key-1", + "head": []interface{}{ + map[string]interface{}{ + "instance_type": "node-type-1", + "disk_size": 512, + "node_id": "head-node-id-1", + "ha_enabled": false, + "private_ip": "ip1", + }, + }, + "workers": schema.NewSet(helpers.WorkerSetHash, []interface{}{ + map[string]interface{}{ + "instance_type": "node-type-2", + "disk_size": 256, + "count": 2, + "spot_config": []interface{}{}, + "private_ips": []interface{}{"ip2", "ip3"}, + }, + }), + "attach_public_ip": true, + "issue_lets_encrypt_certificate": true, + "managed_users": true, + "backup_retention_period": 10, + "gcp_attributes": []interface{}{ + map[string]interface{}{ + "project_id": "project-1", + "region": "region-1", + "zone": "zone-1", + "service_account_email": "email@iam.com", + "network": []interface{}{ + map[string]interface{}{ + "network_name": "network-1", + "subnetwork_name": "sub-1", + }, + }, + "gke_cluster_name": "", + "bucket": []interface{}{ + map[string]interface{}{ + "name": "bucket-1", + }, + }, + "disk_encryption": []interface{}{}, + }, + }, + "azure_attributes": []interface{}{}, + "aws_attributes": []interface{}{}, + }, + } + r.Apply(t, context.TODO()) +} diff --git a/templates/index.md.tmpl b/templates/index.md.tmpl index 7bc4291..98e3877 100644 --- a/templates/index.md.tmpl +++ b/templates/index.md.tmpl @@ -12,6 +12,7 @@ The Hopsworksai terraform 
provider is used to interact with [Hopsworks.ai](https If you are new to Hopsworks, then first you need to create an account on [Hopsworks.ai](https://managed.hopsworks.ai), and then you can follow one of the getting started guides to connect either your AWS account or Azure account to create your own Hopsworks clusters. * [Getting Started with AWS](https://docs.hopsworks.ai/latest/setup_installation/aws/getting_started/) * [Getting Started with Azure](https://docs.hopsworks.ai/latest/setup_installation/azure/getting_started/) + * [Getting Started with GCP](https://docs.hopsworks.ai/latest/setup_installation/gcp/getting_started/) -> A Hopsworks API Key is required to allow the provider to manage clusters on Hopsworks.ai on your behalf. To create an API Key, follow [this guide](https://docs.hopsworks.ai/latest/setup_installation/common/api_key). @@ -22,7 +23,7 @@ In the following sections, we show two usage examples to create Hopsworks cluste Hopsworks.ai deploys Hopsworks clusters to your AWS account using the permissions provided during [account setup](https://docs.hopsworks.ai/latest/setup_installation/aws/getting_started/#step-1-connecting-your-aws-account). To create a Hopsworks cluster, you will need to create an empty S3 bucket, an ssh key, and an instance profile with the required [Hopsworks permissions](https://docs.hopsworks.ai/latest/setup_installation/aws/getting_started/#step-2-creating-instance-profile). -If you have already created these 3 resources, you can skip the first step in the following terraform example and instead fill the corresponding attributes in Step 2 (*bucket_name*, *ssh_key*, *instance_profile_arn*) with your configuration. +If you have already created these 3 resources, you can skip the first step in the following terraform example and instead fill the corresponding attributes in Step 2 (*bucket/name*, *ssh_key*, *instance_profile_arn*) with your configuration. 
Otherwise, you need to setup the credentials for your AWS account locally as described [here](https://registry.terraform.io/providers/hashicorp/aws/latest/docs), then you can run the following terraform example which creates the required AWS resources and a Hopsworks cluster. {{tffile "examples/provider/provider_aws.tf"}} @@ -38,4 +39,14 @@ Notice that you need to replace "*YOUR AZURE RESOURCE GROUP*" with the resource {{tffile "examples/provider/provider_azure.tf"}} +## GCP Example Usage + +Similar to AWS and Azure, Hopsworks.ai deploys Hopsworks clusters to your GCP project using the permissions provided during [account setup](https://docs.hopsworks.ai/latest/setup_installation/gcp/getting_started/#step-1-connecting-your-gcp-account). +To create a Hopsworks cluster, you will need to create a storage bucket and a service account with the required [Hopsworks permissions](https://docs.hopsworks.ai/latest/setup_installation/gcp/getting_started/#step-3-creating-a-service-account-for-your-cluster-instances). +If you have already created these 2 resources, you can skip the first step in the following terraform example and instead fill in the corresponding attributes in Step 2 (*service_account_email*, *bucket/name*) with your configuration. +Otherwise, you need to set up the credentials for your Google account locally as described [here](https://registry.terraform.io/providers/hashicorp/google/latest/docs), then you can run the following terraform example which creates the required Google resources and a Hopsworks cluster. + + +{{tffile "examples/provider/provider_gcp.tf"}} + {{ .SchemaMarkdown | trimspace }} \ No newline at end of file