Solving issues with IAM member roles attribution (#251)
* predicting for only the users with traffic in the past 72h - purchase propensity

* running inference only for users events in the past 72h

* including 72h users for all models predictions

* considering null values in TabWorkflow models

* deleting unused pipfile

* upgrading lib versions

* implementing reporting preprocessing as a new pipeline

* adding more code documentation

* adding important information on the main README.md and DEVELOPMENT.md

* adding schedule run name and more code documentation

* implementing a new scheduler using the vertex ai sdk & adding user_id to procedures for consistency

* adding more code documentation

* adding code doc to the python custom component

* adding more code documentation

* fixing aggregated predictions query

* removing unnecessary resources from deployment

* Writing MDS guide

* adding the MDS developer and troubleshooting documentation

* fixing deployment for activation pipelines and gemini dataset

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* removing deprecated api

* fixing purchase propensity pipelines names

* adding extra condition for when there is not enough data for the window interval to be applied on backfill procedures

* adding more instructions for post deployment and fixing issues when GA4 export was configured for less than 10 days

* removing unnecessary comments

* adding the number of past days to process in the variables files

* adding comment about combining data from different ga4 export datasets to data store

* fixing small issues with feature engineering and ml pipelines

* fixing hyper parameter tuning for kmeans modeling

* fixing optuna parameters

* adding cloud shell image

* fixing the list of all possible users in the propensity training preparation tables

* additional guardrails for when there is not enough data

* adding more documentation

* adding more doc to feature store

* add feature store documentation

* adding ml pipelines docs

* adding ml pipelines docs

* adding more documentation

* adding user agent client info

* fixing scope of client info

* fix

* removing client_info from vertex components

* fixing versioning of tf submodules

* reconfiguring meta providers

* fixing issue 187

* chore(deps): upgrade terraform providers and modules version

* chore(deps): set the provider version

* chore: formatting

* fix: brand naming

* fix: typo

* fixing secrets issue

* implementing secrets region as tf variable

* implementing secrets region as tf variable

* last changes requested by lgrangeau

* documenting keys location better

* implementing vpc peering network

* Update README.md

* Rebase Main into Multi-property (#243)

* Update README.md

* ensure the build bucket is created in the specified region (#230)

* Update audience_segmentation_query_template.sqlx

* Update auto_audience_segmentation_query_template.sqlx

* Update churn_propensity_query_template.sqlx

* Update cltv_query_template.sqlx

* Update purchase_propensity_query_template.sqlx

* Restrict regions for GCP Cloud Build support (#241)

* Update README.md

* Move to uv (#242)

* add uv required project table segment in toml file

* switch to uv in terraform deployment

* switch to uv

* remove poetry usage from terraform

* format

* remove poetry

* Add files via upload

---------

Co-authored-by: Charlie Wang
Co-authored-by: Mårten Lindblad

* supporting property id in the resources

* fixing iam member roles issues

---------

Co-authored-by: Carlos Timoteo
Co-authored-by: Laurent Grangeau
Co-authored-by: Charlie Wang
Co-authored-by: Mårten Lindblad
5 people authored Nov 22, 2024
1 parent fdad6d1 commit 74756a7
Showing 2 changed files with 178 additions and 47 deletions.
19 changes: 19 additions & 0 deletions infrastructure/terraform/.terraform.lock.hcl

Some generated files are not rendered by default.

206 changes: 159 additions & 47 deletions infrastructure/terraform/modules/data-store/iam-binding.tf
@@ -12,18 +12,6 @@
# See the License for the specific language governing permissions and
# limitations under the License.

# TODO: we might not need to have this email role at all.
resource "google_project_iam_member" "email-role" {
for_each = toset([
"roles/iam.serviceAccountUser", // TODO: is it really needed?
"roles/dataform.admin",
"roles/dataform.editor"
])
role = each.key
member = "user:${var.project_owner_email}"
project = null_resource.check_dataform_api.id != "" ? module.data_processing_project_services.project_id : data.google_project.data_processing.project_id
}

# Check the Dataform Service Account Access Requirements for more information
# https://cloud.google.com/dataform/docs/required-access
locals {
@@ -38,15 +26,15 @@ resource "null_resource" "wait_for_dataform_sa_creation" {
MAX_TRIES=100
while ! gcloud asset search-all-iam-policies --scope=projects/${module.data_processing_project_services.project_id} --flatten="policy.bindings[].members[]" --filter="policy.bindings.members~\"serviceAccount:\"" --format="value(policy.bindings.members.split(sep=\":\").slice(1))" | grep -i "${local.dataform_sa}" && [ $COUNTER -lt $MAX_TRIES ]
do
sleep 3
sleep 10
printf "."
COUNTER=$((COUNTER + 1))
done
if [ $COUNTER -eq $MAX_TRIES ]; then
echo "dataform service account was not created, terraform can not continue!"
exit 1
fi
sleep 20
sleep 120
EOT
}

@@ -56,61 +44,185 @@ resource "null_resource" "wait_for_dataform_sa_creation" {
]
}

module "email-role" {
source = "terraform-google-modules/iam/google//modules/member_iam"
version = "~> 8.0"

service_account_address = var.project_owner_email
project_id = null_resource.check_dataform_api.id != "" ? module.data_processing_project_services.project_id : data.google_project.data_processing.project_id
project_roles = [
"roles/iam.serviceAccountUser", // TODO: is it really needed?
"roles/dataform.admin",
"roles/dataform.editor"
]
prefix = "user"
}
#resource "google_project_iam_member" "email-role" {
# for_each = toset([
# "roles/iam.serviceAccountUser", // TODO: is it really needed?
# "roles/dataform.admin",
# "roles/dataform.editor"
# ])
# role = each.key
# member = "user:${var.project_owner_email}"
# project = null_resource.check_dataform_api.id != "" ? module.data_processing_project_services.project_id : data.google_project.data_processing.project_id
#}

# Propagation time for change of access policy typically takes 2 minutes
# according to https://cloud.google.com/iam/docs/access-change-propagation
# this wait make sure the policy changes are propagated before proceeding
# with the build
resource "time_sleep" "wait_for_email_role_propagation" {
create_duration = "120s"
depends_on = [
module.email-role
]
}

# This resource sets the Dataform service account IAM member roles
resource "google_project_iam_member" "dataform-serviceaccount" {
module "dataform-serviceaccount" {
source = "terraform-google-modules/iam/google//modules/member_iam"
version = "~> 8.0"
depends_on = [
google_dataform_repository.marketing-analytics,
null_resource.check_dataform_api,
null_resource.wait_for_dataform_sa_creation
null_resource.wait_for_dataform_sa_creation,
time_sleep.wait_for_email_role_propagation
]
for_each = toset([
service_account_address = local.dataform_sa
project_id = null_resource.check_dataform_api.id != "" ? module.data_processing_project_services.project_id : data.google_project.data_processing.project_id
project_roles = [
"roles/secretmanager.secretAccessor",
"roles/bigquery.jobUser"
])
role = each.key
member = "serviceAccount:${local.dataform_sa}"
project = null_resource.check_dataform_api.id != "" ? module.data_processing_project_services.project_id : data.google_project.data_processing.project_id
"roles/bigquery.jobUser",
"roles/bigquery.dataOwner",
]
prefix = "serviceAccount"
}
# This resource sets the Dataform service account IAM member roles
#resource "google_project_iam_member" "dataform-serviceaccount" {
# depends_on = [
# google_dataform_repository.marketing-analytics,
# null_resource.check_dataform_api,
# null_resource.wait_for_dataform_sa_creation,
# time_sleep.wait_for_email_role_propagation
# ]
# for_each = toset([
# "roles/secretmanager.secretAccessor",
# "roles/bigquery.jobUser",
# "roles/bigquery.dataOwner",
# ])
# role = each.key
# member = "serviceAccount:${local.dataform_sa}"
# project = null_resource.check_dataform_api.id != "" ? module.data_processing_project_services.project_id : data.google_project.data_processing.project_id
#}

// Owner role to BigQuery in the destination data project the Dataform SA.
// Multiple datasets will be created; it requires project-level permissions
resource "google_project_iam_member" "dataform-bigquery-data-owner" {
# Propagation time for change of access policy typically takes 2 minutes
# according to https://cloud.google.com/iam/docs/access-change-propagation
# this wait make sure the policy changes are propagated before proceeding
# with the build
resource "time_sleep" "wait_for_dataform-serviceaccount_role_propagation" {
create_duration = "120s"
depends_on = [
google_dataform_repository.marketing-analytics,
null_resource.check_dataform_api,
null_resource.wait_for_dataform_sa_creation
module.dataform-serviceaccount
]
for_each = toset([
"roles/bigquery.dataOwner",
])
role = each.key
member = "serviceAccount:${local.dataform_sa}"
project = null_resource.check_dataform_api.id != "" ? module.data_processing_project_services.project_id : data.google_project.data_processing.project_id
}

// Read access to the GA4 exports
resource "google_bigquery_dataset_iam_member" "dataform-ga4-export-reader" {
module "dataform-ga4-export-reader" {
source = "terraform-google-modules/iam/google//modules/bigquery_datasets_iam"
version = "~> 8.0"
depends_on = [
google_dataform_repository.marketing-analytics,
null_resource.check_dataform_api,
null_resource.wait_for_dataform_sa_creation
null_resource.wait_for_dataform_sa_creation,
time_sleep.wait_for_dataform-serviceaccount_role_propagation
]
project = var.source_ga4_export_project_id
bigquery_datasets = [
var.source_ga4_export_dataset,
]
mode = "authoritative"

bindings = {
"roles/bigquery.dataViewer" = [
"serviceAccount:${local.dataform_sa}",
]
"roles/bigquery.dataEditor" = [
"serviceAccount:${local.dataform_sa}",
]
}
}
#resource "google_bigquery_dataset_iam_member" "dataform-ga4-export-reader" {
# depends_on = [
# google_dataform_repository.marketing-analytics,
# null_resource.check_dataform_api,
# null_resource.wait_for_dataform_sa_creation,
# time_sleep.wait_for_dataform-serviceaccount_role_propagation
# ]
# role = "roles/bigquery.dataViewer"
# member = "serviceAccount:${local.dataform_sa}"
# project = var.source_ga4_export_project_id
# dataset_id = var.source_ga4_export_dataset
#}

# Propagation time for change of access policy typically takes 2 minutes
# according to https://cloud.google.com/iam/docs/access-change-propagation
# this wait make sure the policy changes are propagated before proceeding
# with the build
resource "time_sleep" "wait_for_dataform-ga4-export-reader_role_propagation" {
create_duration = "120s"
depends_on = [
module.dataform-ga4-export-reader
]
role = "roles/bigquery.dataViewer"
member = "serviceAccount:${local.dataform_sa}"
project = var.source_ga4_export_project_id
dataset_id = var.source_ga4_export_dataset
}

// Read access to the Ads datasets
resource "google_bigquery_dataset_iam_member" "dataform-ads-export-reader" {
module "dataform-ads-export-reader" {
source = "terraform-google-modules/iam/google//modules/bigquery_datasets_iam"
version = "~> 8.0"
depends_on = [
google_dataform_repository.marketing-analytics,
null_resource.check_dataform_api,
null_resource.wait_for_dataform_sa_creation
null_resource.wait_for_dataform_sa_creation,
time_sleep.wait_for_dataform-ga4-export-reader_role_propagation
]
count = length(var.source_ads_export_data)
project = var.source_ads_export_data[count.index].project
bigquery_datasets = [
var.source_ads_export_data[count.index].dataset,
]
mode = "authoritative"

bindings = {
"roles/bigquery.dataViewer" = [
"serviceAccount:${local.dataform_sa}",
]
"roles/bigquery.dataEditor" = [
"serviceAccount:${local.dataform_sa}",
]
}
}
#resource "google_bigquery_dataset_iam_member" "dataform-ads-export-reader" {
# depends_on = [
# google_dataform_repository.marketing-analytics,
# null_resource.check_dataform_api,
# null_resource.wait_for_dataform_sa_creation,
# time_sleep.wait_for_dataform-ga4-export-reader_role_propagation
# ]
# count = length(var.source_ads_export_data)
# role = "roles/bigquery.dataViewer"
# member = "serviceAccount:${local.dataform_sa}"
# project = var.source_ads_export_data[count.index].project
# dataset_id = var.source_ads_export_data[count.index].dataset
#}

# Propagation time for change of access policy typically takes 2 minutes
# according to https://cloud.google.com/iam/docs/access-change-propagation
# this wait make sure the policy changes are propagated before proceeding
# with the build
resource "time_sleep" "wait_for_dataform-ads-export-reader_role_propagation" {
create_duration = "120s"
depends_on = [
module.dataform-ads-export-reader
]
count = length(var.source_ads_export_data)
role = "roles/bigquery.dataViewer"
member = "serviceAccount:${local.dataform_sa}"
project = var.source_ads_export_data[count.index].project
dataset_id = var.source_ads_export_data[count.index].dataset
}
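
The change above repeats one pattern several times: grant roles, wait for the IAM policy to propagate, then let the next step depend on the wait rather than on the grant. A minimal standalone sketch of that pattern follows. It reuses the `member_iam` module inputs and the `time_sleep` resource seen in the diff, but the variables, the resource names, and the example dataset are hypothetical and not taken from the repository.

```hcl
# Illustrative sketch only. Variable names, resource names, and the dataset
# below are assumptions made for this example.
terraform {
  required_providers {
    google = {
      source = "hashicorp/google"
    }
    # time_sleep is provided by the hashicorp/time provider.
    time = {
      source = "hashicorp/time"
    }
  }
}

variable "project_id" {
  type        = string
  description = "Project that receives the IAM grants (hypothetical)."
}

variable "service_account_email" {
  type        = string
  description = "Service account that is granted the roles (hypothetical)."
}

# Step 1: grant project-level roles to the service account.
module "example_sa_roles" {
  source  = "terraform-google-modules/iam/google//modules/member_iam"
  version = "~> 8.0"

  service_account_address = var.service_account_email
  project_id              = var.project_id
  project_roles = [
    "roles/bigquery.jobUser",
    "roles/bigquery.dataOwner",
  ]
  prefix = "serviceAccount"
}

# Step 2: pause after the grant so the policy change has time to propagate.
resource "time_sleep" "wait_for_example_sa_roles" {
  create_duration = "120s"
  depends_on      = [module.example_sa_roles]
}

# Step 3: whatever needs the roles depends on the sleep, not on the grant,
# so Terraform only creates it once the wait has finished.
resource "google_bigquery_dataset" "example" {
  project    = var.project_id
  dataset_id = "example_dataset"
  depends_on = [time_sleep.wait_for_example_sa_roles]
}
```

Chaining each grant behind the previous wait, as the diff does, serializes the IAM changes at the cost of a longer apply. `create_duration` delays only resource creation, so later applies that change nothing are not slowed down.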
