Move to uv #242

Merged
merged 7 commits into from
Nov 15, 2024
2 changes: 1 addition & 1 deletion DEVELOPMENT.md
@@ -3,7 +3,7 @@ Marketing Analytics Jumpstart consists of an easy, extensible and automated impl

## Developer pre-requisites
Use Visual Studio Code to develop the solution. Install Gemini Code Assistant, Docker, GitHub, Hashicorp, Terraform, Jinja extensions.
You should have Python 3, Poetry, Terraform, Git and Docker installed in your developer terminal environment.
You should have Python 3, uv, Terraform, Git and Docker installed in your developer terminal environment.

## Preparing development environment

40 changes: 6 additions & 34 deletions infrastructure/cloudshell/tutorial.md
@@ -12,45 +12,17 @@ export PROJECT_ID="<walkthrough-project-id/>"
gcloud config set project $PROJECT_ID
```

## Install or update Python3
Install a compatible version of Python 3.8-3.10 and set the CLOUDSDK_PYTHON environment variable to point to it.
```sh
sudo apt-get install python3.10
CLOUDSDK_PYTHON=python3.10
```
## Install or update uv for running Python scripts
Install [uv](https://docs.astral.sh/uv/), which manages the Python version and dependencies for the solution.

## Install Python's Poetry and set Poetry to use Python 3.10 version
[Poetry](https://python-poetry.org/docs/) is a Python tool for dependency management and packaging.
If you are installing in Cloud Shell, use the following commands:
```sh
pipx install poetry
```
If you don't have pipx installed - follow the [Pipx installation guide](https://pipx.pypa.io/stable/installation/)
```sh
sudo apt update
sudo apt install pipx
pipx ensurepath
pipx install poetry
```
Verify that `poetry` is on your $PATH variable:
```sh
poetry --version
```
If it fails - add it to your $PATH variable:
```sh
curl -LsSf https://astral.sh/uv/install.sh | sh
export PATH="$HOME/.local/bin:$PATH"
```
Verify poetry is properly installed, run:
```sh
poetry --version
```
Set poetry to use your latest python3
```sh
poetry env use python3
```
Install python dependencies, run:

Check uv installation
```sh
poetry install
uv --version
```
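
For illustration only (this is not part of the change), the check above could be wrapped in a small shell sketch that reports a missing `uv` with a hint instead of a bare "command not found". It assumes the installer placed `uv` in `$HOME/.local/bin`; `require_cmd` is a hypothetical helper name.

```sh
# Hypothetical helper: report whether a command is available on PATH.
require_cmd() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1 found at $(command -v "$1")"
  else
    # Assumption: the uv installer uses $HOME/.local/bin as its target directory.
    echo "$1 not found; add \$HOME/.local/bin to PATH or re-run the installer" >&2
    return 1
  fi
}

# Print the version only when uv is actually reachable.
if require_cmd uv; then
  uv --version
fi
```

In a uv-managed project, `uv run <command>` resolves and installs the project's dependencies into its environment before running the command, which is why no separate install step follows the version check.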

## Authenticate with additional OAuth 2.0 scopes
52 changes: 9 additions & 43 deletions infrastructure/terraform/README.md
@@ -43,52 +43,18 @@ Also, this method allows you to extend this solution and develop it to satisfy y
gcloud config set project $PROJECT_ID
```

1. Install or update Python3
Install a compatible version of Python 3.8-3.10 and set the CLOUDSDK_PYTHON environment variable to point to it.
1. Install or update uv for running Python scripts
Install [uv](https://docs.astral.sh/uv/), which manages the Python version and dependencies for the solution.

```bash
sudo apt-get install python3.10
CLOUDSDK_PYTHON=python3.10
```sh
curl -LsSf https://astral.sh/uv/install.sh | sh
export PATH="$HOME/.local/bin:$PATH"
```
If you are installing on a Mac:
```shell
brew install python@3.10
CLOUDSDK_PYTHON=python3.10
```

1. Install Python's Poetry and set Poetry to use Python 3.10 version

[Poetry](https://python-poetry.org/docs/) is a Python tool for dependency management and packaging.

If you are installing in Cloud Shell, use the following commands:
```shell
pipx install poetry
```
If you don't have pipx installed - follow the [Pipx installation guide](https://pipx.pypa.io/stable/installation/)
```shell
sudo apt update
sudo apt install pipx
pipx ensurepath
pipx install poetry
```
Verify that `poetry` is on your $PATH variable:
```shell
poetry --version
```
If it fails - add it to your $PATH variable:
```shell
export PATH="$HOME/.local/bin:$PATH"
```
If you are installing on a Mac:
```shell
brew install poetry
```
Set poetry to use your latest python3
```shell
SOURCE_ROOT=${HOME}/${REPO}
cd ${SOURCE_ROOT}
poetry env use python3
```
Check uv installation:
```sh
uv --version
```

1. Authenticate with additional OAuth 2.0 scopes needed to use the Google Analytics Admin API:
```shell
49 changes: 11 additions & 38 deletions infrastructure/terraform/main.tf
@@ -69,8 +69,8 @@ locals {
source_root_dir = "../.."
# The config_file_name is the name of the config file.
config_file_name = "config"
# The poetry_run_alias is the alias of the poetry command.
poetry_run_alias = "${var.poetry_cmd} run"
# The uv_run_alias is the alias of the uv run command.
uv_run_alias = "${var.uv_cmd} run"
# The mds_dataset_suffix is the suffix of the marketing data store dataset.
mds_dataset_suffix = var.create_staging_environment ? "staging" : var.create_dev_environment ? "dev" : "prod"
# The project_toml_file_path is the path to the project.toml file.
@@ -127,39 +127,22 @@ resource "local_file" "feature_store_configuration" {
})
}

# Runs the poetry command to install the dependencies.
# The command is: poetry install
resource "null_resource" "poetry_install" {
triggers = {
create_command = "${var.poetry_cmd} lock && ${var.poetry_cmd} install"
source_contents_hash = local.project_toml_content_hash
}

# Only run the command when `terraform apply` executes and the resource doesn't exist.
provisioner "local-exec" {
when = create
command = self.triggers.create_command
working_dir = local.source_root_dir
}
}

data "external" "check_ga4_property_type" {
program = ["bash", "-c", "${local.poetry_run_alias} ga4-setup --ga4_resource=check_property_type --ga4_property_id=${var.ga4_property_id} --ga4_stream_id=${var.ga4_stream_id}"]
program = ["bash", "-c", "${local.uv_run_alias} ga4-setup --ga4_resource=check_property_type --ga4_property_id=${var.ga4_property_id} --ga4_stream_id=${var.ga4_stream_id}"]
working_dir = local.source_root_dir
depends_on = [null_resource.poetry_install]
}

# Runs the poetry invoke command to generate the sql queries and procedures.
# Runs the invoke tasks through uv run to generate the sql queries and procedures.
# This command is executed before the feature store is created.
resource "null_resource" "generate_sql_queries" {

triggers = {
# The create command generates the sql queries and procedures.
# The command is: poetry inv [function_name] --env-name=${local.config_file_name}
# The command is: uv run inv [function_name] --env-name=${local.config_file_name}
# The --env-name argument is the name of the configuration file.
create_command = <<-EOT
${local.poetry_run_alias} inv apply-config-parameters-to-all-queries --env-name=${local.config_file_name}
${local.poetry_run_alias} inv apply-config-parameters-to-all-procedures --env-name=${local.config_file_name}
${local.uv_run_alias} inv apply-config-parameters-to-all-queries --env-name=${local.config_file_name}
${local.uv_run_alias} inv apply-config-parameters-to-all-procedures --env-name=${local.config_file_name}
EOT

# The destroy command removes the generated sql queries and procedures.
@@ -171,10 +154,6 @@ resource "null_resource" "generate_sql_queries" {
# The working directory is the root of the project.
working_dir = local.source_root_dir

# The poetry_installed trigger is the ID of the null_resource.poetry_install resource.
# This is used to ensure that the poetry command is run before the generate_sql_queries command.
poetry_installed = null_resource.poetry_install.id

# The source_contents_hash trigger is the hash of the project.toml file.
# This is used to ensure that the generate_sql_queries command is run only if the project.toml file has changed.
# It also ensures that the generate_sql_queries command is run only if the sql queries and procedures have changed.
@@ -415,15 +394,12 @@ module "pipelines" {
# The source is the path to the pipelines module.
source = "./modules/pipelines"
config_file_path = local_file.feature_store_configuration.id != "" ? local_file.feature_store_configuration.filename : ""
poetry_run_alias = local.poetry_run_alias
uv_run_alias = local.uv_run_alias
# The count determines if the pipelines are created or not.
# If the count is 1, the pipelines are created.
# If the count is 0, the pipelines are not created.
# This is done to avoid creating the pipelines if the `deploy_pipelines` variable is set to false in the terraform.tfvars file.
count = var.deploy_pipelines ? 1 : 0
# The poetry_installed trigger is the ID of the null_resource.poetry_install resource.
# This is used to ensure that the poetry command is run before the pipelines module is created.
poetry_installed = null_resource.poetry_install.id
# The project_id is the project in which the data is stored.
# This is set to the data project ID in the terraform.tfvars file.
mds_project_id = var.data_project_id
@@ -454,9 +430,9 @@ module "activation" {
# The trigger function is used to trigger the activation function.
# The trigger function is created in the same region as the activation function.
trigger_function_location = var.google_default_region
# The poetry_cmd is the poetry_cmd variable.
# This can be set on the poetry_cmd in the terraform.tfvars file.
poetry_cmd = var.poetry_cmd
# The uv_run_alias is the local uv_run_alias value, built from the uv_cmd variable.
# The uv_cmd variable can be set in the terraform.tfvars file.
uv_run_alias = local.uv_run_alias
# The ga4_measurement_id is the ga4_measurement_id variable.
# This can be set on the ga4_measurement_id in the terraform.tfvars file.
ga4_measurement_id = var.ga4_measurement_id
@@ -479,9 +455,6 @@ module "activation" {
# This is done to avoid creating the activation function if the `deploy_activation` variable is set
# to false in the terraform.tfvars file.
count = var.deploy_activation ? 1 : 0
# The poetry_installed is the ID of the null_resource poetry_install
# This is used to ensure that the poetry command is run before the activation module is created.
poetry_installed = null_resource.poetry_install.id
mds_project_id = var.data_project_id
mds_dataset_suffix = local.mds_dataset_suffix

7 changes: 3 additions & 4 deletions infrastructure/terraform/modules/activation/main.tf
@@ -15,7 +15,6 @@
locals {
app_prefix = "activation"
source_root_dir = "../.."
poetry_run_alias = "${var.poetry_cmd} run"
template_dir = "${local.source_root_dir}/templates"
pipeline_source_dir = "${local.source_root_dir}/python/activation"
trigger_function_dir = "${local.source_root_dir}/python/function"
@@ -373,7 +372,7 @@ resource "null_resource" "create_custom_events" {
}
provisioner "local-exec" {
command = <<-EOT
${local.poetry_run_alias} ga4-setup --ga4_resource=custom_events --ga4_property_id=${var.ga4_property_id} --ga4_stream_id=${var.ga4_stream_id}
${var.uv_run_alias} ga4-setup --ga4_resource=custom_events --ga4_property_id=${var.ga4_property_id} --ga4_stream_id=${var.ga4_stream_id}
EOT
working_dir = local.source_root_dir
}
@@ -391,7 +390,7 @@ resource "null_resource" "create_custom_dimensions" {
}
provisioner "local-exec" {
command = <<-EOT
${local.poetry_run_alias} ga4-setup --ga4_resource=custom_dimensions --ga4_property_id=${var.ga4_property_id} --ga4_stream_id=${var.ga4_stream_id}
${var.uv_run_alias} ga4-setup --ga4_resource=custom_dimensions --ga4_property_id=${var.ga4_property_id} --ga4_stream_id=${var.ga4_stream_id}
EOT
working_dir = local.source_root_dir
}
@@ -447,7 +446,7 @@ module "trigger_function_account" {
# a python command defined in the module ga4_setup.
# This information can then be used in other parts of the Terraform configuration to access the retrieved information.
data "external" "ga4_measurement_properties" {
program = ["bash", "-c", "${local.poetry_run_alias} ga4-setup --ga4_resource=measurement_properties --ga4_property_id=${var.ga4_property_id} --ga4_stream_id=${var.ga4_stream_id}"]
program = ["bash", "-c", "${var.uv_run_alias} ga4-setup --ga4_resource=measurement_properties --ga4_property_id=${var.ga4_property_id} --ga4_stream_id=${var.ga4_stream_id}"]
working_dir = local.source_root_dir
# The count attribute specifies how many times the external data source should be executed.
# This means that the external data source will be executed only if either the
9 changes: 2 additions & 7 deletions infrastructure/terraform/modules/activation/variables.tf
@@ -43,8 +43,8 @@ variable "trigger_function_location" {
type = string
}

variable "poetry_cmd" {
description = "alias for poetry command on the current system"
variable "uv_run_alias" {
description = "alias for uv run command on the current system"
type = string
}

@@ -72,11 +72,6 @@ variable "ga4_stream_id" {
type = string
}

variable "poetry_installed" {
description = "Construct to specify dependency to poetry installed"
type = string
}

variable "mds_project_id" {
type = string
description = "MDS Project ID"
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -1471,7 +1471,7 @@ resource "null_resource" "create_gemini_model" {

provisioner "local-exec" {
command = <<-EOT
${local.poetry_run_alias} bq query --use_legacy_sql=false --max_rows=100 --maximum_bytes_billed=10000000 < ${data.local_file.create_gemini_model_file.filename}
${var.uv_run_alias} bq query --use_legacy_sql=false --max_rows=100 --maximum_bytes_billed=10000000 < ${data.local_file.create_gemini_model_file.filename}
EOT
}

1 change: 0 additions & 1 deletion infrastructure/terraform/modules/feature-store/main.tf
@@ -21,7 +21,6 @@ locals {
config_bigquery = local.config_vars.bigquery
feature_store_project_id = local.config_vars.bigquery.dataset.feature_store.project_id
sql_dir = var.sql_dir_input
poetry_run_alias = "${var.poetry_cmd} run"
builder_repository_id = "marketing-analytics-jumpstart-base-repo"
purchase_propensity_project_id = null_resource.check_bigquery_api.id != "" ? local.config_vars.bigquery.dataset.purchase_propensity.project_id : local.feature_store_project_id
churn_propensity_project_id = null_resource.check_bigquery_api.id != "" ? local.config_vars.bigquery.dataset.churn_propensity.project_id : local.feature_store_project_id
6 changes: 3 additions & 3 deletions infrastructure/terraform/modules/feature-store/variables.tf
@@ -37,8 +37,8 @@ variable "sql_dir_input" {
description = "SQL queries directory"
}

variable "poetry_cmd" {
description = "alias for poetry command on the current system"
variable "uv_run_alias" {
description = "alias for uv run command on the current system"
type = string
default = "poetry"
default = "uv run"
}
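
As a reading aid (not part of the diff), here is a minimal shell sketch of how the `uv run` alias composes the commands that `main.tf` passes to `local-exec`. `UV_CMD` stands in for `var.uv_cmd`, whose declaration and default are not shown in this diff, so the `uv` default here is an assumption. `render_inv_cmd` is a hypothetical helper name.

```sh
# Sketch only: mirrors how main.tf builds commands from the uv alias.
# UV_CMD stands in for var.uv_cmd (assumed default "uv"; not shown in the diff).
UV_CMD="${UV_CMD:-uv}"
UV_RUN_ALIAS="${UV_CMD} run"   # same shape as local.uv_run_alias

# Hypothetical helper: print the command string Terraform would render
# for a given invoke task and configuration name.
render_inv_cmd() {
  task="$1"
  env_name="${2:-config}"      # local.config_file_name is "config" in main.tf
  printf '%s inv %s --env-name=%s\n' "$UV_RUN_ALIAS" "$task" "$env_name"
}

render_inv_cmd apply-config-parameters-to-all-queries
render_inv_cmd apply-config-parameters-to-all-procedures
```

Because the modules now receive the full alias (`uv_run_alias`) rather than the bare command, each module no longer has to rebuild the `run` suffix in its own `locals` block.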