Add pudl usage metrics gcp infrastructure #3841

Merged
merged 4 commits into from
Sep 17, 2024
Changes from 3 commits
38 changes: 37 additions & 1 deletion terraform/main.tf
@@ -228,7 +228,7 @@ resource "google_cloud_run_v2_service" "pudl-superset" {
volumes {
name = "cloudsql"
cloud_sql_instance {
- instances = ["catalyst-cooperative-pudl:us-central1:superset-database"]
+ instances = ["catalyst-cooperative-pudl:us-central1:superset-database", "catalyst-cooperative-pudl:us-central1:pudl-usage-metrics-db"]
}
}
}
@@ -396,3 +396,39 @@ resource "google_service_account_iam_member" "gce-default-account-iam" {
role = "roles/iam.serviceAccountUser"
member = "serviceAccount:[email protected]"
}

resource "google_secret_manager_secret" "pudl_usage_metrics_db_connection_string" {
secret_id = "pudl-usage-metrics-db-connection-string"
replication {
auto {}
}
}
Comment on lines +400 to +405
Member Author

I figured I'd save the connection string in case we need to reconnect the db to superset.


resource "google_storage_bucket" "pudl_usage_metrics_archive_bucket" {
name = "pudl-usage-metrics-archives.catalyst.coop"
location = "US"
storage_class = "STANDARD"

uniform_bucket_level_access = true
}

resource "google_service_account" "usage_metrics_archiver" {
account_id = "usage-metrics-archiver"
display_name = "PUDL usage metrics archiver github action service account"
}
Comment on lines +415 to +418
Member Author

I did create a service account key for the GitHub action in the business repo. @jdangerx would love a WIF tutorial soon!

Member

I've forgotten everything I know about WIF but could re-learn it!
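
For reference, a Workload Identity Federation (WIF) setup that lets a GitHub action impersonate `usage_metrics_archiver` without a long-lived key could look roughly like the sketch below. None of this is in the PR: the pool and provider IDs and the `OWNER/REPO` path are placeholders, since the thread only says the action lives in the business repo.

```hcl
# Hypothetical sketch, not part of this PR: keyless auth for the archiver action.
# The pool/provider IDs and "OWNER/REPO" are placeholders.
resource "google_iam_workload_identity_pool" "github_actions" {
  workload_identity_pool_id = "github-actions-pool"
  display_name              = "GitHub Actions"
}

resource "google_iam_workload_identity_pool_provider" "github_actions" {
  workload_identity_pool_id          = google_iam_workload_identity_pool.github_actions.workload_identity_pool_id
  workload_identity_pool_provider_id = "github-actions-provider"

  # Map claims from GitHub's OIDC token onto Google attributes.
  attribute_mapping = {
    "google.subject"       = "assertion.sub"
    "attribute.repository" = "assertion.repository"
  }

  # Only accept tokens issued for the one repository that runs the archiver.
  attribute_condition = "assertion.repository == \"OWNER/REPO\""

  oidc {
    issuer_uri = "https://token.actions.githubusercontent.com"
  }
}

# Let workflows from that repository impersonate the existing service account.
resource "google_service_account_iam_member" "usage_metrics_archiver_wif" {
  service_account_id = google_service_account.usage_metrics_archiver.name
  role               = "roles/iam.workloadIdentityUser"
  member             = "principalSet://iam.googleapis.com/${google_iam_workload_identity_pool.github_actions.name}/attribute.repository/OWNER/REPO"
}
```

With something like this in place, the workflow would need the `id-token: write` permission and would authenticate with the provider's resource name instead of a stored key, at which point the service account key could be deleted.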


resource "google_storage_bucket_iam_member" "usage_metrics_archiver_gcs_iam" {
for_each = toset(["roles/storage.objectCreator", "roles/storage.objectViewer"])

bucket = google_storage_bucket.pudl_usage_metrics_archive_bucket.name
role = each.key
member = "serviceAccount:${google_service_account.usage_metrics_archiver.email}"
}

resource "google_storage_bucket_iam_member" "usage_metrics_etl_gcs_iam" {
for_each = toset(["roles/storage.legacyBucketReader", "roles/storage.objectViewer"])
Member Author

I couldn't find a non-legacy role that gives a principal the storage.buckets.get and storage.objects.get permissions. Seems like the GCS Python client wants both to access objects in a bucket.

Member

This is probably fine. If we want to switch to non-legacy roles, it looks like we could give roles/storage.objectUser for objects.get and the confusingly named roles/storage.insightsCollectorService gets you buckets.get.


bucket = google_storage_bucket.pudl_usage_metrics_archive_bucket.name
role = each.key
member = "serviceAccount:pudl-usage-metrics-etl@catalyst-cooperative-pudl.iam.gserviceaccount.com"
}
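
Another way to avoid the legacy role, not raised in the thread above, would be a project-level custom role that carries only the two permissions the author names. This is a sketch under that assumption: the resource names, `role_id`, and `title` are made up for illustration, and it would replace the `for_each` binding above rather than sit alongside it.

```hcl
# Hypothetical alternative, not part of this PR: a custom role holding only
# the two permissions named in the review thread.
resource "google_project_iam_custom_role" "usage_metrics_bucket_reader" {
  role_id     = "usageMetricsBucketReader"
  title       = "PUDL usage metrics bucket reader"
  permissions = ["storage.buckets.get", "storage.objects.get"]
}

# Bind the custom role to the ETL service account on the archive bucket.
resource "google_storage_bucket_iam_member" "usage_metrics_etl_custom_role" {
  bucket = google_storage_bucket.pudl_usage_metrics_archive_bucket.name
  role   = google_project_iam_custom_role.usage_metrics_bucket_reader.id
  member = "serviceAccount:pudl-usage-metrics-etl@catalyst-cooperative-pudl.iam.gserviceaccount.com"
}
```

Unlike `roles/storage.objectViewer`, this role does not include `storage.objects.list`, so that permission would have to be added if the ETL lists objects in the bucket.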