Rebase main into multi-property #256

Open · wants to merge 38 commits into base: multi-property
Commits (38)
adde430  Update README.md (chmstimoteo, Nov 6, 2024)
bb82162  ensure the build bucket is created in the specified region (#230) (kingman, Nov 9, 2024)
425b83e  Update audience_segmentation_query_template.sqlx (chmstimoteo, Nov 11, 2024)
de5fa3d  Update auto_audience_segmentation_query_template.sqlx (chmstimoteo, Nov 11, 2024)
326a364  Update churn_propensity_query_template.sqlx (chmstimoteo, Nov 11, 2024)
da881fe  Update cltv_query_template.sqlx (chmstimoteo, Nov 11, 2024)
841c523  Update purchase_propensity_query_template.sqlx (chmstimoteo, Nov 11, 2024)
d5d84d7  Restrict regions for GCP Cloud Build support (#241) (martenlindblad, Nov 14, 2024)
639bbff  Update README.md (chmstimoteo, Nov 14, 2024)
ec5b667  Move to uv (#242) (kingman, Nov 15, 2024)
87917d7  Support property id in resources (#246) (chmstimoteo, Nov 15, 2024)
bc1fc2b  Update terraform-template.tfvars (chmstimoteo, Nov 15, 2024)
fdad6d1  Update setup.py (chmstimoteo, Nov 15, 2024)
74756a7  Solving issues with IAM member roles attribution (#251) (chmstimoteo, Nov 22, 2024)
b9b127f  Fixing issues with BQ to Vertex Service Account IAM member role (#252) (chmstimoteo, Nov 22, 2024)
1631315  Fixing issue with Vertex AI Model Connection on BigQuery (#253) (chmstimoteo, Nov 22, 2024)
f3d8179  Update bigquery-procedures.tf (chmstimoteo, Nov 22, 2024)
0311b7b  Update bigquery-procedures.tf (chmstimoteo, Nov 22, 2024)
44bf251  Update bigquery-procedures.tf (chmstimoteo, Nov 22, 2024)
6b7db8a  Update README.md (chmstimoteo, Nov 25, 2024)
8c21114  Update README.md (chmstimoteo, Nov 25, 2024)
7b9fb8c  Add files via upload (chmstimoteo, Nov 26, 2024)
92b585a  specify the columns in the backfill procedures, makes sure the script… (kingman, Nov 26, 2024)
e775f69  Add files via upload (chmstimoteo, Nov 26, 2024)
d528d17  Update README.md (chmstimoteo, Nov 26, 2024)
d707e78  Update README.md (chmstimoteo, Nov 26, 2024)
2b24996  Update invoke_churn_propensity_training_preparation.sqlx (chmstimoteo, Dec 4, 2024)
ea3be2e  Update invoke_purchase_propensity_training_preparation.sqlx (chmstimoteo, Dec 4, 2024)
fb3aac8  Update invoke_churn_propensity_training_preparation.sqlx (chmstimoteo, Dec 4, 2024)
d55624c  Update invoke_purchase_propensity_training_preparation.sqlx (chmstimoteo, Dec 4, 2024)
a4ca058  Update invoke_customer_lifetime_value_training_preparation.sqlx (chmstimoteo, Dec 4, 2024)
fda8762  Update invoke_purchase_propensity_training_preparation.sqlx (chmstimoteo, Dec 9, 2024)
5a7eb5a  Set pipeline state as terraform variables (#260) (martenlindblad, Dec 9, 2024)
5dec332  Reset variable reference for purchase propensity (#263) (martenlindblad, Dec 9, 2024)
3b98a4b  Update export-procedures.tf (chmstimoteo, Dec 9, 2024)
0631f34  Update config.yaml.tftpl (chmstimoteo, Dec 9, 2024)
d7ee88a  Update component.py (chmstimoteo, Dec 13, 2024)
4a6edcc  new view for aggregated stat on purchase propensity predictions (#268) (kingman, Dec 16, 2024)
2 changes: 1 addition & 1 deletion DEVELOPMENT.md
@@ -3,7 +3,7 @@ Marketing Analytics Jumpstart consists of an easy, extensible and automated impl

## Developer pre-requisites
Use Visual Studio Code to develop the solution. Install Gemini Code Assistant, Docker, GitHub, Hashicorp, Terraform, Jinja extensions.
- You should have Python 3, Poetry, Terraform, Git and Docker installed in your developer terminal environment.
+ You should have Python 3, uv, Terraform, Git and Docker installed in your developer terminal environment.
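A quick way to confirm these prerequisites are available in your terminal (a sketch; it only checks that each tool is on PATH):

```sh
# Each command prints a version string if the tool is installed.
python3 --version
uv --version
terraform --version
git --version
docker --version
```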

## Preparing development environment

11 changes: 9 additions & 2 deletions README.md
@@ -1,8 +1,15 @@
# Marketing Analytics Jumpstart
- Marketing Analytics Jumpstart is a terraform automated, quick-to-deploy, customizable end-to-end marketing solution on Google Cloud Platform (GCP). This solution aims at helping customer better understand and better use their digital advertising budget.
+ Marketing Analytics Jumpstart (MAJ) is a Terraform-automated, quick-to-deploy, customizable end-to-end marketing solution on Google Cloud Platform (GCP). This solution aims at helping customers better understand and better use their digital advertising budget.

Customers are looking to drive revenue and increase media efficiency by identifying, predicting and targeting valuable users through the use of machine learning. However, marketers first have to solve the challenge of having a number of disparate data sources that prevent them from having a holistic view of customers. Marketers also often don't have the expertise and/or resources in their marketing departments to train, run, and activate ML models on paid channels. Without this solution that enables innovation through predictive analytics, marketers are missing opportunities to advance their marketing program and accelerate key goals and objectives (e.g. acquire new customers, improve customer retention, etc).

+ ## Version Variants
+
+ | Version Name | Branch | Purpose |
+ | ------------ | ------ | ------- |
+ | Multi Stream Activation | [multi-stream-activation](https://github.com/GoogleCloudPlatform/marketing-analytics-jumpstart/tree/multi-stream-activation) | Activation to multiple Google Analytics 4 data streams (websites and applications). |
+ | Multi Property | [multi-property](https://github.com/GoogleCloudPlatform/marketing-analytics-jumpstart/tree/multi-property) | Deployment of multiple MAJ resource sets, one per Google Analytics 4 property, in the same Google Cloud project. |
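
For example, to work from one of these variants, clone its branch directly (shown for `multi-property`):

```bash
# Clone the multi-property variant branch of the repository.
git clone -b multi-property https://github.com/GoogleCloudPlatform/marketing-analytics-jumpstart.git
cd marketing-analytics-jumpstart
```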

## Quick Installation ⏰

Want to quickly install and use it? Run this [installation notebook 📔](https://colab.sandbox.google.com/github/GoogleCloudPlatform/marketing-analytics-jumpstart/blob/main/notebooks/quick_installation.ipynb) on Google Colaboratory and leverage Marketing Analytics Jumpstart in under 30 minutes.
@@ -112,7 +119,7 @@ This high-level architecture demonstrates how Marketing Analytics Jumpstart inte
- [ ] [Backfill](https://cloud.google.com/bigquery/docs/google-ads-transfer) BigQuery Data Transfer service for Google Ads
- [ ] Have existing Google Analytics 4 Property with [Measurement ID](https://support.google.com/analytics/answer/12270356?hl=en)

- **Note:** Google Ads Customer Matching currently only works with Google Analytics 4 **Properties** linked to Google Ads Accounts, it won't work for subproperties or Rollup properties.
+ **Note:** Google Ads Customer Matching currently only works with Google Analytics 4 **Properties** and **Subproperties** linked to Google Ads Accounts; it won't work for Rollup properties.

## Installation Permissions and Privileges
- [ ] Google Analytics Property Editor or Owner
44 changes: 12 additions & 32 deletions config/config.yaml.tftpl
@@ -195,9 +195,7 @@ vertex_ai:
subnetwork: "default"
# If you want to use the vpc network defined above, set the following flag to true
use_private_service_access: false
- # The `state` defines the state of the pipeline.
- # In case you don't want to schedule the pipeline, set the state to `PAUSED`.
- state: PAUSED # possible states ACTIVE or PAUSED
+ state: '${pipeline_configuration.feature-creation-auto-audience-segmentation.execution.schedule.state}'
# The `pipeline_parameters` defines the parameters that are going to be used to compile the pipeline.
# Those values may differ depending on the pipeline type and the pipeline steps being used.
# Make sure you review the python function that defines the pipeline.
@@ -279,9 +277,7 @@ vertex_ai:
subnetwork: "default"
# If you want to use the vpc network defined above, set the following flag to true
use_private_service_access: false
- # The `state` defines the state of the pipeline.
- # In case you don't want to schedule the pipeline, set the state to `PAUSED`.
- state: PAUSED # possible states ACTIVE or PAUSED
+ state: '${pipeline_configuration.feature-creation-audience-segmentation.execution.schedule.state}'
# The `pipeline_parameters` defines the parameters that are going to be used to compile the pipeline.
# Those values may differ depending on the pipeline type and the pipeline steps being used.
# Make sure you review the python function that defines the pipeline.
@@ -344,9 +340,7 @@ vertex_ai:
subnetwork: "default"
# If you want to use the vpc network defined above, set the following flag to true
use_private_service_access: false
- # The `state` defines the state of the pipeline.
- # In case you don't want to schedule the pipeline, set the state to `PAUSED`.
- state: PAUSED # possible states ACTIVE or PAUSED
+ state: '${pipeline_configuration.feature-creation-purchase-propensity.execution.schedule.state}'
pipeline_parameters:
project_id: "${project_id}"
location: "${location}"
@@ -407,9 +401,7 @@ vertex_ai:
subnetwork: "default"
# If you want to use the vpc network defined above, set the following flag to true
use_private_service_access: false
- # The `state` defines the state of the pipeline.
- # In case you don't want to schedule the pipeline, set the state to `PAUSED`.
- state: PAUSED # possible states ACTIVE or PAUSED
+ state: '${pipeline_configuration.feature-creation-churn-propensity.execution.schedule.state}'
pipeline_parameters:
project_id: "${project_id}"
location: "${location}"
@@ -464,9 +456,7 @@ vertex_ai:
subnetwork: "default"
# If you want to use the vpc network defined above, set the following flag to true
use_private_service_access: false
- # The `state` defines the state of the pipeline.
- # In case you don't want to schedule the pipeline, set the state to `PAUSED`.
- state: PAUSED # possible states ACTIVE or PAUSED
+ state: '${pipeline_configuration.feature-creation-customer-ltv.execution.schedule.state}'
pipeline_parameters:
project_id: "${project_id}"
location: "${location}"
@@ -527,9 +517,7 @@ vertex_ai:
subnetwork: "default"
# If you want to use the vpc network defined above, set the following flag to true
use_private_service_access: false
- # The `state` defines the state of the pipeline.
- # In case you don't want to schedule the pipeline, set the state to `PAUSED`.
- state: PAUSED # possible states ACTIVE or PAUSED
+ state: '${pipeline_configuration.feature-creation-aggregated-value-based-bidding.execution.schedule.state}'
pipeline_parameters:
project_id: "${project_id}"
location: "${location}"
@@ -575,9 +563,7 @@ vertex_ai:
subnetwork: "default"
# If you want to use the vpc network defined above, set the following flag to true
use_private_service_access: false
- # The `state` defines the state of the pipeline.
- # In case you don't want to schedule the pipeline, set the state to `PAUSED`.
- state: PAUSED # possible states ACTIVE or PAUSED
+ state: '${pipeline_configuration.value_based_bidding.training.schedule.state}'
# These are pipeline parameters that will be passed to the pipeline to be recompiled
pipeline_parameters:
project: "${project_id}"
@@ -658,9 +644,7 @@ vertex_ai:
subnetwork: "default"
# If you want to use the vpc network defined above, set the following flag to true
use_private_service_access: false
- # The `state` defines the state of the pipeline.
- # In case you don't want to schedule the pipeline, set the state to `PAUSED`.
- state: PAUSED # possible states ACTIVE or PAUSED
+ state: '${pipeline_configuration.value_based_bidding.explanation.schedule.state}'
pipeline_parameters:
project: "${project_id}"
location: "${cloud_region}"
@@ -705,9 +689,7 @@ vertex_ai:
subnetwork: "default"
# If you want to use the vpc network defined above, set the following flag to true
use_private_service_access: false
- # The `state` defines the state of the pipeline.
- # In case you don't want to schedule the pipeline, set the state to `PAUSED`.
- state: PAUSED # possible states ACTIVE or PAUSED
+ state: '${pipeline_configuration.purchase_propensity.training.schedule.state}'
# These are pipeline parameters that will be passed to the pipeline to be recompiled
pipeline_parameters:
project: "${project_id}"
@@ -808,9 +790,7 @@ vertex_ai:
subnetwork: "default"
# If you want to use the vpc network defined above, set the following flag to true
use_private_service_access: false
- # The `state` defines the state of the pipeline.
- # In case you don't want to schedule the pipeline, set the state to `PAUSED`.
- state: PAUSED # possible states ACTIVE or PAUSED
+ state: '${pipeline_configuration.purchase_propensity.prediction.schedule.state}'
pipeline_parameters:
project_id: "${project_id}"
location: "${cloud_region}"
@@ -870,10 +850,10 @@ vertex_ai:
# Follow the guide: https://cloud.google.com/vertex-ai/docs/general/vpc-peering
subnetwork: "default"
# If you want to use the vpc network defined above, set the following flag to true
- use_private_service_access: false
+ use_private_service_access: false
- # The `state` defines the state of the pipeline.
- # In case you don't want to schedule the pipeline, set the state to `PAUSED`.
- state: PAUSED # possible states ACTIVE or PAUSED
+ state: '${pipeline_configuration.churn_propensity.training.schedule.state}'
# These are pipeline parameters that will be passed to the pipeline to be recompiled
pipeline_parameters:
project: "${project_id}"
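Each `state: '${pipeline_configuration...schedule.state}'` placeholder above is rendered by Terraform from the `pipeline_configuration` variable introduced in "Set pipeline state as terraform variables (#260)". A minimal sketch of what that variable might look like in tfvars, using only key paths that appear in the template; the repo's actual variable declaration may differ:

```hcl
# Hypothetical tfvars sketch: per-pipeline schedule states consumed by config.yaml.tftpl.
# Set a state to ACTIVE to schedule that pipeline, or PAUSED to leave it unscheduled.
pipeline_configuration = {
  value_based_bidding = {
    training    = { schedule = { state = "PAUSED" } }
    explanation = { schedule = { state = "PAUSED" } }
  }
  purchase_propensity = {
    training   = { schedule = { state = "PAUSED" } }
    prediction = { schedule = { state = "ACTIVE" } }
  }
  churn_propensity = {
    training = { schedule = { state = "PAUSED" } }
  }
}
```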
7 changes: 3 additions & 4 deletions docs/data_store.md
@@ -107,12 +107,11 @@ To deploy the Marketing Data Store, follow the pre-requisites and instructions i
Next, after creating the Terraform variables file by making a copy from the template, set the Terraform variables to create the environments you need for Dataform.

```bash
- create_dev_environment = false
- create_staging_environment = false
- create_prod_environment = true
+ deploy_dataform = true
+ property_id = "PROPERTY_ID"
```

- When the `create_dev_environment` variable is set to `true`, a development environment will be created. When the `create_staging_environment` variable is set to `true`, a staging environment will be created. When the `create_prod_environment` variable is set to `true`, a production environment will be created.
+ When the `deploy_dataform` variable is set to `true`, a Dataform workspace will be created.

![Dataform Repository](images/data_store_dataform_github_repository.png)
After deploying the Marketing Data Store, the repository called `marketing_analytics` is created in Dataform.
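Presumably the `deploy_dataform` flag gates the Dataform resources with a conditional count, a common Terraform pattern for optional resources; a minimal sketch under that assumption (the variable name comes from the diff above, but the resource wiring here is illustrative, not the repo's actual code):

```hcl
variable "deploy_dataform" {
  description = "Whether to create the Dataform repository and workspace."
  type        = bool
  default     = true
}

# Hypothetical wiring: the Dataform repository is created only when the flag is set.
resource "google_dataform_repository" "marketing_analytics" {
  count    = var.deploy_dataform ? 1 : 0
  provider = google-beta
  name     = "marketing_analytics"
  region   = var.google_default_region # assumed region variable
}
```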
6 changes: 2 additions & 4 deletions infrastructure/cloudshell/terraform-template.tfvars
@@ -17,10 +17,7 @@
tf_state_project_id = "${MAJ_DEFAULT_PROJECT_ID}"
google_default_region = "${MAJ_DEFAULT_REGION}"

- create_dev_environment = false
- create_staging_environment = false
- create_prod_environment = true
-
+ deploy_dataform = true
deploy_activation = true
deploy_feature_store = true
deploy_pipelines = true
@@ -30,6 +27,7 @@ deploy_monitoring = true

data_project_id = "${MAJ_MDS_PROJECT_ID}"
destination_data_location = "${MAJ_MDS_DATA_LOCATION}"
+ property_id = "${MAJ_GA4_PROPERTY_ID}"
data_processing_project_id = "${MAJ_MDS_DATAFORM_PROJECT_ID}"
source_ga4_export_project_id = "${MAJ_GA4_EXPORT_PROJECT_ID}"
source_ga4_export_dataset = "${MAJ_GA4_EXPORT_DATASET}"
40 changes: 6 additions & 34 deletions infrastructure/cloudshell/tutorial.md
@@ -12,45 +12,17 @@ export PROJECT_ID="<walkthrough-project-id/>"
gcloud config set project $PROJECT_ID
```

- ## Install or update Python3
- Install a compatible version of Python 3.8-3.10 and set the CLOUDSDK_PYTHON environment variable to point to it.
- ```sh
- sudo apt-get install python3.10
- CLOUDSDK_PYTHON=python3.10
- ```
+ ## Install or update uv for running Python scripts
+ Install [uv](https://docs.astral.sh/uv/), which manages the Python version and dependencies for the solution.

- ## Install Python's Poetry and set Poetry to use Python 3.10 version
- [Poetry](https://python-poetry.org/docs/) is a Python's tool for dependency management and packaging.
- If you are installing on in Cloud Shell use the following commands:
- ```sh
- pipx install poetry
- ```
- If you don't have pipx installed - follow the [Pipx installation guide](https://pipx.pypa.io/stable/installation/)
- ```sh
- sudo apt update
- sudo apt install pipx
- pipx ensurepath
- pipx install poetry
- ```
- Verify that `poetry` is on your $PATH variable:
- ```sh
- poetry --version
- ```
- If it fails - add it to your $PATH variable:
```sh
+ curl -LsSf https://astral.sh/uv/install.sh | sh
export PATH="$HOME/.local/bin:$PATH"
```
- Verify poetry is properly installed, run:
- ```sh
- poetry --version
- ```
- Set poetry to use your latest python3
- ```sh
- poetry env use python3
- ```
- Install python dependencies, run:

+ Check uv installation
```sh
- poetry install
+ uv --version
```
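
With uv installed, the solution's Python dependencies are typically installed with a single command from the repository root (a sketch; it assumes the project's `pyproject.toml` and lockfile drive the environment):

```sh
# Create the project virtual environment and install locked dependencies.
uv sync

# Run commands through uv so the managed environment and Python version are used.
uv run python --version
```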

## Authenticate with additional OAuth 2.0 scopes
19 changes: 19 additions & 0 deletions infrastructure/terraform/.terraform.lock.hcl

(Generated file; the diff is not rendered by default.)
68 changes: 21 additions & 47 deletions infrastructure/terraform/README.md
@@ -43,52 +43,18 @@ Also, this method allows you to extend this solution and develop it to satisfy y
gcloud config set project $PROJECT_ID
```

- 1. Install or update Python3
- Install a compatible version of Python 3.8-3.10 and set the CLOUDSDK_PYTHON environment variable to point to it.
+ 1. Install or update uv for running Python scripts
+ Install [uv](https://docs.astral.sh/uv/), which manages the Python version and dependencies for the solution.

- ```bash
- sudo apt-get install python3.10
- CLOUDSDK_PYTHON=python3.10
+ ```sh
+ curl -LsSf https://astral.sh/uv/install.sh | sh
+ export PATH="$HOME/.local/bin:$PATH"
```
- If you are installing on a Mac:
- ```shell
- brew install python@3.10
- CLOUDSDK_PYTHON=python3.10
- ```

- 1. Install Python's Poetry and set Poetry to use Python 3.10 version

- [Poetry](https://python-poetry.org/docs/) is a Python's tool for dependency management and packaging.

- If you are installing on in Cloud Shell use the following commands:
- ```shell
- pipx install poetry
- ```
- If you don't have pipx installed - follow the [Pipx installation guide](https://pipx.pypa.io/stable/installation/)
- ```shell
- sudo apt update
- sudo apt install pipx
- pipx ensurepath
- pipx install poetry
- ```
- Verify that `poetry` is on your $PATH variable:
- ```shell
- poetry --version
- ```
- If it fails - add it to your $PATH variable:
- ```shell
- export PATH="$HOME/.local/bin:$PATH"
- ```
- If you are installing on a Mac:
- ```shell
- brew install poetry
- ```
- Set poetry to use your latest python3
- ```shell
- SOURCE_ROOT=${HOME}/${REPO}
- cd ${SOURCE_ROOT}
- poetry env use python3
- ```
+ Check uv installation:
+ ```sh
+ uv --version
+ ```
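
uv can also install and pin a specific interpreter for the project; the version below is an assumption, the repo's `pyproject.toml` `requires-python` is authoritative:

```sh
# Install a uv-managed Python and pin it for this project (writes .python-version).
uv python install 3.10
uv python pin 3.10
```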

1. Authenticate with additional OAuth 2.0 scopes needed to use the Google Analytics Admin API:
```shell
@@ -121,6 +87,14 @@
terraform --version
```

+ **Note:** If you have an Apple Silicon MacBook, you should install Terraform by setting the `TFENV_ARCH` environment variable:
+ ```shell
+ TFENV_ARCH=amd64 tfenv install 1.9.7
+ tfenv use 1.9.7
+ terraform --version
+ ```
+ If the installed Terraform version does not match your architecture, `terraform init` will fail; one way to check is sketched below.
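One way to check which architecture your machine and the installed binary report (a sketch; output wording varies by OS):

```shell
# Machine architecture: arm64 on Apple Silicon, x86_64 on Intel.
uname -m
# Build target of the terraform binary on PATH.
file "$(which terraform)"
```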

For instance, the output on macOS should look like:
```shell
Terraform v1.9.7
@@ -188,10 +162,10 @@ Because a Cloud Shell session is ephemeral, your Cloud Shell session could termi

Reset your Google Cloud Project ID variables:

- ```shell
- export PROJECT_ID="[your Google Cloud project id]"
- gcloud config set project $PROJECT_ID
- ```
+ ```bash
+ export PROJECT_ID="[your Google Cloud project id]"
+ gcloud config set project $PROJECT_ID
+ ```

Follow the authentication workflow, since your credentials expire daily:
