Rebase Main into Multi-stream #244

Open
wants to merge 110 commits into
base: multi-stream-activation
from

Conversation

chmstimoteo
Collaborator

Description

Please provide a description of the changes you have made in this PR.

How has this been tested?

Please explain how you have tested the new changes.

Checklist

  • I have commented my code, particularly in hard-to-understand areas
  • I have successfully run the E2E tests, and have included the links to the pipeline runs below
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • I have updated any relevant documentation to reflect my changes
  • I have assigned a reviewer and messaged them

Pipeline run links:

kingman and others added 30 commits September 11, 2024 16:30
* specify dashboard dependent tables

* remove project-id placeholder
* predicting for only the users with traffic in the past 72h - purchase propensity

* running inference only for users events in the past 72h

* including 72h users for all models predictions

* considering null values in TabWorkflow models

* deleting unused pipfile

* upgrading lib versions

* implementing reporting preprocessing as a new pipeline

* adding more code documentation

* adding important information on the main README.md and DEVELOPMENT.md

* adding schedule run name and more code documentation

* implementing a new scheduler using the vertex ai sdk & adding user_id to procedures for consistency

* adding more code documentation

* adding code doc to the python custom component

* adding more code documentation

* fixing aggregated predictions query

* removing unnecessary resources from deployment

* Writing MDS guide

* adding the MDS developer and troubleshooting documentation

* fixing deployment for activation pipelines and gemini dataset

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* removing deprecated api

* fixing purchase propensity pipelines names

* adding extra condition for when there is not enough data for the window interval to be applied on backfill procedures

* adding more instructions for post deployment and fixing issues when GA4 export was configured for less than 10 days

* removing unnecessary comments

* adding the number of past days to process in the variables files

* adding comment about combining data from different ga4 export datasets to data store

* fixing small issues with feature engineering and ml pipelines

* fixing hyper parameter tuning for kmeans modeling

* fixing optuna parameters

* adding cloud shell image

* fixing the list of all possible users in the propensity training preparation tables

* additional guardrails for when there is not enough data

* adding more documentation

* adding more doc to feature store

* add feature store documentation

* adding ml pipelines docs

* adding ml pipelines docs

* adding more documentation

* adding user agent client info

* fixing scope of client info

* fix

* removing client_info from vertex components

* fixing versioning of tf submodules

* reconfiguring meta providers

* fixing issue 187

* adding quick installation process

* removing state active

---------

Co-authored-by: Carlos Timoteo <[email protected]>
* predicting for only the users with traffic in the past 72h - purchase propensity

* running inference only for users events in the past 72h

* including 72h users for all models predictions

* considering null values in TabWorkflow models

* deleting unused pipfile

* upgrading lib versions

* implementing reporting preprocessing as a new pipeline

* adding more code documentation

* adding important information on the main README.md and DEVELOPMENT.md

* adding schedule run name and more code documentation

* implementing a new scheduler using the vertex ai sdk & adding user_id to procedures for consistency

* adding more code documentation

* adding code doc to the python custom component

* adding more code documentation

* fixing aggregated predictions query

* removing unnecessary resources from deployment

* Writing MDS guide

* adding the MDS developer and troubleshooting documentation

* fixing deployment for activation pipelines and gemini dataset

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* removing deprecated api

* fixing purchase propensity pipelines names

* adding extra condition for when there is not enough data for the window interval to be applied on backfill procedures

* adding more instructions for post deployment and fixing issues when GA4 export was configured for less than 10 days

* removing unnecessary comments

* adding the number of past days to process in the variables files

* adding comment about combining data from different ga4 export datasets to data store

* fixing small issues with feature engineering and ml pipelines

* fixing hyper parameter tuning for kmeans modeling

* fixing optuna parameters

* adding cloud shell image

* fixing the list of all possible users in the propensity training preparation tables

* additional guardrails for when there is not enough data

* adding more documentation

* adding more doc to feature store

* add feature store documentation

* adding ml pipelines docs

* adding ml pipelines docs

* adding more documentation

* adding user agent client info

* fixing scope of client info

* fix

* removing client_info from vertex components

* fixing versioning of tf submodules

* reconfiguring meta providers

* fixing issue 187

* adding quick installation process

* removing state active

* fixing notebook header

---------

Co-authored-by: Carlos Timoteo <[email protected]>
* predicting for only the users with traffic in the past 72h - purchase propensity

* running inference only for users events in the past 72h

* including 72h users for all models predictions

* considering null values in TabWorkflow models

* deleting unused pipfile

* upgrading lib versions

* implementing reporting preprocessing as a new pipeline

* adding more code documentation

* adding important information on the main README.md and DEVELOPMENT.md

* adding schedule run name and more code documentation

* implementing a new scheduler using the vertex ai sdk & adding user_id to procedures for consistency

* adding more code documentation

* adding code doc to the python custom component

* adding more code documentation

* fixing aggregated predictions query

* removing unnecessary resources from deployment

* Writing MDS guide

* adding the MDS developer and troubleshooting documentation

* fixing deployment for activation pipelines and gemini dataset

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* removing deprecated api

* fixing purchase propensity pipelines names

* adding extra condition for when there is not enough data for the window interval to be applied on backfill procedures

* adding more instructions for post deployment and fixing issues when GA4 export was configured for less than 10 days

* removing unnecessary comments

* adding the number of past days to process in the variables files

* adding comment about combining data from different ga4 export datasets to data store

* fixing small issues with feature engineering and ml pipelines

* fixing hyper parameter tuning for kmeans modeling

* fixing optuna parameters

* adding cloud shell image

* fixing the list of all possible users in the propensity training preparation tables

* additional guardrails for when there is not enough data

* adding more documentation

* adding more doc to feature store

* add feature store documentation

* adding ml pipelines docs

* adding ml pipelines docs

* adding more documentation

* adding user agent client info

* fixing scope of client info

* fix

* removing client_info from vertex components

* fixing versioning of tf submodules

* reconfiguring meta providers

* fixing issue 187

* adding quick installation process

* removing state active

* fixing notebook header

* removing notebook cells outputs

---------

Co-authored-by: Carlos Timoteo <[email protected]>
* chore(deps): upgrade terraform providers and modules version

* chore(deps): set the provider version

* chore: formatting

* fix: brand naming

* fix: typo

---------

Co-authored-by: Laurent Grangeau <[email protected]>
* predicting for only the users with traffic in the past 72h - purchase propensity

* running inference only for users events in the past 72h

* including 72h users for all models predictions

* considering null values in TabWorkflow models

* deleting unused pipfile

* upgrading lib versions

* implementing reporting preprocessing as a new pipeline

* adding more code documentation

* adding important information on the main README.md and DEVELOPMENT.md

* adding schedule run name and more code documentation

* implementing a new scheduler using the vertex ai sdk & adding user_id to procedures for consistency

* adding more code documentation

* adding code doc to the python custom component

* adding more code documentation

* fixing aggregated predictions query

* removing unnecessary resources from deployment

* Writing MDS guide

* adding the MDS developer and troubleshooting documentation

* fixing deployment for activation pipelines and gemini dataset

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* removing deprecated api

* fixing purchase propensity pipelines names

* adding extra condition for when there is not enough data for the window interval to be applied on backfill procedures

* adding more instructions for post deployment and fixing issues when GA4 export was configured for less than 10 days

* removing unnecessary comments

* adding the number of past days to process in the variables files

* adding comment about combining data from different ga4 export datasets to data store

* fixing small issues with feature engineering and ml pipelines

* fixing hyper parameter tuning for kmeans modeling

* fixing optuna parameters

* adding cloud shell image

* fixing the list of all possible users in the propensity training preparation tables

* additional guardrails for when there is not enough data

* adding more documentation

* adding more doc to feature store

* add feature store documentation

* adding ml pipelines docs

* adding ml pipelines docs

* adding more documentation

* adding user agent client info

* fixing scope of client info

* fix

* removing client_info from vertex components

* fixing versioning of tf submodules

* reconfiguring meta providers

* fixing issue 187

* chore(deps): upgrade terraform providers and modules version

* chore(deps): set the provider version

* chore: formatting

* fix: brand naming

* fix: typo

* fixing secrets issue

---------

Co-authored-by: Carlos Timoteo <[email protected]>
Co-authored-by: Laurent Grangeau <[email protected]>
chmstimoteo and others added 30 commits November 14, 2024 13:59
* add uv required project table segment in toml file

* switch to uv in terraform deployment

* switch to uv

* remove poetry usage from terraform

* format

* remove poetry

* Add files via upload
* predicting for only the users with traffic in the past 72h - purchase propensity

* running inference only for users events in the past 72h

* including 72h users for all models predictions

* considering null values in TabWorkflow models

* deleting unused pipfile

* upgrading lib versions

* implementing reporting preprocessing as a new pipeline

* adding more code documentation

* adding important information on the main README.md and DEVELOPMENT.md

* adding schedule run name and more code documentation

* implementing a new scheduler using the vertex ai sdk & adding user_id to procedures for consistency

* adding more code documentation

* adding code doc to the python custom component

* adding more code documentation

* fixing aggregated predictions query

* removing unnecessary resources from deployment

* Writing MDS guide

* adding the MDS developer and troubleshooting documentation

* fixing deployment for activation pipelines and gemini dataset

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* removing deprecated api

* fixing purchase propensity pipelines names

* adding extra condition for when there is not enough data for the window interval to be applied on backfill procedures

* adding more instructions for post deployment and fixing issues when GA4 export was configured for less than 10 days

* removing unnecessary comments

* adding the number of past days to process in the variables files

* adding comment about combining data from different ga4 export datasets to data store

* fixing small issues with feature engineering and ml pipelines

* fixing hyper parameter tuning for kmeans modeling

* fixing optuna parameters

* adding cloud shell image

* fixing the list of all possible users in the propensity training preparation tables

* additional guardrails for when there is not enough data

* adding more documentation

* adding more doc to feature store

* add feature store documentation

* adding ml pipelines docs

* adding ml pipelines docs

* adding more documentation

* adding user agent client info

* fixing scope of client info

* fix

* removing client_info from vertex components

* fixing versioning of tf submodules

* reconfiguring meta providers

* fixing issue 187

* chore(deps): upgrade terraform providers and modules version

* chore(deps): set the provider version

* chore: formatting

* fix: brand naming

* fix: typo

* fixing secrets issue

* implementing secrets region as tf variable

* implementing secrets region as tf variable

* last changes requested by lgrangeau

* documenting keys location better

* implementing vpc peering network

* Update README.md

* Rebase Main into Multi-property (#243)

* Update README.md

* ensure the build bucket is created in the specified region (#230)

* Update audience_segmentation_query_template.sqlx

* Update auto_audience_segmentation_query_template.sqlx

* Update churn_propensity_query_template.sqlx

* Update cltv_query_template.sqlx

* Update purchase_propensity_query_template.sqlx

* Restrict regions for GCP Cloud Build support (#241)

* Update README.md

* Move to uv (#242)

* add uv required project table segment in toml file

* switch to uv in terraform deployment

* switch to uv

* remove poetry usage from terraform

* format

* remove poetry

* Add files via upload

---------

Co-authored-by: Charlie Wang <[email protected]>
Co-authored-by: Mårten Lindblad <[email protected]>

* supporting property id in the resources

---------

Co-authored-by: Carlos Timoteo <[email protected]>
Co-authored-by: Laurent Grangeau <[email protected]>
Co-authored-by: Charlie Wang <[email protected]>
Co-authored-by: Mårten Lindblad <[email protected]>
* predicting for only the users with traffic in the past 72h - purchase propensity

* running inference only for users events in the past 72h

* including 72h users for all models predictions

* considering null values in TabWorkflow models

* deleting unused pipfile

* upgrading lib versions

* implementing reporting preprocessing as a new pipeline

* adding more code documentation

* adding important information on the main README.md and DEVELOPMENT.md

* adding schedule run name and more code documentation

* implementing a new scheduler using the vertex ai sdk & adding user_id to procedures for consistency

* adding more code documentation

* adding code doc to the python custom component

* adding more code documentation

* fixing aggregated predictions query

* removing unnecessary resources from deployment

* Writing MDS guide

* adding the MDS developer and troubleshooting documentation

* fixing deployment for activation pipelines and gemini dataset

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* removing deprecated api

* fixing purchase propensity pipelines names

* adding extra condition for when there is not enough data for the window interval to be applied on backfill procedures

* adding more instructions for post deployment and fixing issues when GA4 export was configured for less than 10 days

* removing unnecessary comments

* adding the number of past days to process in the variables files

* adding comment about combining data from different ga4 export datasets to data store

* fixing small issues with feature engineering and ml pipelines

* fixing hyper parameter tuning for kmeans modeling

* fixing optuna parameters

* adding cloud shell image

* fixing the list of all possible users in the propensity training preparation tables

* additional guardrails for when there is not enough data

* adding more documentation

* adding more doc to feature store

* add feature store documentation

* adding ml pipelines docs

* adding ml pipelines docs

* adding more documentation

* adding user agent client info

* fixing scope of client info

* fix

* removing client_info from vertex components

* fixing versioning of tf submodules

* reconfiguring meta providers

* fixing issue 187

* chore(deps): upgrade terraform providers and modules version

* chore(deps): set the provider version

* chore: formatting

* fix: brand naming

* fix: typo

* fixing secrets issue

* implementing secrets region as tf variable

* implementing secrets region as tf variable

* last changes requested by lgrangeau

* documenting keys location better

* implementing vpc peering network

* Update README.md

* Rebase Main into Multi-property (#243)

* Update README.md

* ensure the build bucket is created in the specified region (#230)

* Update audience_segmentation_query_template.sqlx

* Update auto_audience_segmentation_query_template.sqlx

* Update churn_propensity_query_template.sqlx

* Update cltv_query_template.sqlx

* Update purchase_propensity_query_template.sqlx

* Restrict regions for GCP Cloud Build support (#241)

* Update README.md

* Move to uv (#242)

* add uv required project table segment in toml file

* switch to uv in terraform deployment

* switch to uv

* remove poetry usage from terraform

* format

* remove poetry

* Add files via upload

---------

Co-authored-by: Charlie Wang <[email protected]>
Co-authored-by: Mårten Lindblad <[email protected]>

* supporting property id in the resources

* fixing iam member roles issues

---------

Co-authored-by: Carlos Timoteo <[email protected]>
Co-authored-by: Laurent Grangeau <[email protected]>
Co-authored-by: Charlie Wang <[email protected]>
Co-authored-by: Mårten Lindblad <[email protected]>
* predicting for only the users with traffic in the past 72h - purchase propensity

* running inference only for users events in the past 72h

* including 72h users for all models predictions

* considering null values in TabWorkflow models

* deleting unused pipfile

* upgrading lib versions

* implementing reporting preprocessing as a new pipeline

* adding more code documentation

* adding important information on the main README.md and DEVELOPMENT.md

* adding schedule run name and more code documentation

* implementing a new scheduler using the vertex ai sdk & adding user_id to procedures for consistency

* adding more code documentation

* adding code doc to the python custom component

* adding more code documentation

* fixing aggregated predictions query

* removing unnecessary resources from deployment

* Writing MDS guide

* adding the MDS developer and troubleshooting documentation

* fixing deployment for activation pipelines and gemini dataset

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* removing deprecated api

* fixing purchase propensity pipelines names

* adding extra condition for when there is not enough data for the window interval to be applied on backfill procedures

* adding more instructions for post deployment and fixing issues when GA4 export was configured for less than 10 days

* removing unnecessary comments

* adding the number of past days to process in the variables files

* adding comment about combining data from different ga4 export datasets to data store

* fixing small issues with feature engineering and ml pipelines

* fixing hyper parameter tuning for kmeans modeling

* fixing optuna parameters

* adding cloud shell image

* fixing the list of all possible users in the propensity training preparation tables

* additional guardrails for when there is not enough data

* adding more documentation

* adding more doc to feature store

* add feature store documentation

* adding ml pipelines docs

* adding ml pipelines docs

* adding more documentation

* adding user agent client info

* fixing scope of client info

* fix

* removing client_info from vertex components

* fixing versioning of tf submodules

* reconfiguring meta providers

* fixing issue 187

* chore(deps): upgrade terraform providers and modules version

* chore(deps): set the provider version

* chore: formatting

* fix: brand naming

* fix: typo

* fixing secrets issue

* implementing secrets region as tf variable

* implementing secrets region as tf variable

* last changes requested by lgrangeau

* documenting keys location better

* implementing vpc peering network

* Update README.md

* Rebase Main into Multi-property (#243)

* Update README.md

* ensure the build bucket is created in the specified region (#230)

* Update audience_segmentation_query_template.sqlx

* Update auto_audience_segmentation_query_template.sqlx

* Update churn_propensity_query_template.sqlx

* Update cltv_query_template.sqlx

* Update purchase_propensity_query_template.sqlx

* Restrict regions for GCP Cloud Build support (#241)

* Update README.md

* Move to uv (#242)

* add uv required project table segment in toml file

* switch to uv in terraform deployment

* switch to uv

* remove poetry usage from terraform

* format

* remove poetry

* Add files via upload

---------

Co-authored-by: Charlie Wang <[email protected]>
Co-authored-by: Mårten Lindblad <[email protected]>

* supporting property id in the resources

* fixing iam member roles issues

* fixing issue with service account iam resources

---------

Co-authored-by: Carlos Timoteo <[email protected]>
Co-authored-by: Laurent Grangeau <[email protected]>
Co-authored-by: Charlie Wang <[email protected]>
Co-authored-by: Mårten Lindblad <[email protected]>
* predicting for only the users with traffic in the past 72h - purchase propensity

* running inference only for users events in the past 72h

* including 72h users for all models predictions

* considering null values in TabWorkflow models

* deleting unused pipfile

* upgrading lib versions

* implementing reporting preprocessing as a new pipeline

* adding more code documentation

* adding important information on the main README.md and DEVELOPMENT.md

* adding schedule run name and more code documentation

* implementing a new scheduler using the vertex ai sdk & adding user_id to procedures for consistency

* adding more code documentation

* adding code doc to the python custom component

* adding more code documentation

* fixing aggregated predictions query

* removing unnecessary resources from deployment

* Writing MDS guide

* adding the MDS developer and troubleshooting documentation

* fixing deployment for activation pipelines and gemini dataset

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* removing deprecated api

* fixing purchase propensity pipelines names

* adding extra condition for when there is not enough data for the window interval to be applied on backfill procedures

* adding more instructions for post deployment and fixing issues when GA4 export was configured for less than 10 days

* removing unnecessary comments

* adding the number of past days to process in the variables files

* adding comment about combining data from different ga4 export datasets to data store

* fixing small issues with feature engineering and ml pipelines

* fixing hyper parameter tuning for kmeans modeling

* fixing optuna parameters

* adding cloud shell image

* fixing the list of all possible users in the propensity training preparation tables

* additional guardrails for when there is not enough data

* adding more documentation

* adding more doc to feature store

* add feature store documentation

* adding ml pipelines docs

* adding ml pipelines docs

* adding more documentation

* adding user agent client info

* fixing scope of client info

* fix

* removing client_info from vertex components

* fixing versioning of tf submodules

* reconfiguring meta providers

* fixing issue 187

* chore(deps): upgrade terraform providers and modules version

* chore(deps): set the provider version

* chore: formatting

* fix: brand naming

* fix: typo

* fixing secrets issue

* implementing secrets region as tf variable

* implementing secrets region as tf variable

* last changes requested by lgrangeau

* documenting keys location better

* implementing vpc peering network

* Update README.md

* Rebase Main into Multi-property (#243)

* Update README.md

* ensure the build bucket is created in the specified region (#230)

* Update audience_segmentation_query_template.sqlx

* Update auto_audience_segmentation_query_template.sqlx

* Update churn_propensity_query_template.sqlx

* Update cltv_query_template.sqlx

* Update purchase_propensity_query_template.sqlx

* Restrict regions for GCP Cloud Build support (#241)

* Update README.md

* Move to uv (#242)

* add uv required project table segment in toml file

* switch to uv in terraform deployment

* switch to uv

* remove poetry usage from terraform

* format

* remove poetry

* Add files via upload

---------

Co-authored-by: Charlie Wang <[email protected]>
Co-authored-by: Mårten Lindblad <[email protected]>

* supporting property id in the resources

* fixing iam member roles issues

* fixing issue with service account iam resources

* fixing issue with connection between vertex and bq

---------

Co-authored-by: Carlos Timoteo <[email protected]>
Co-authored-by: Laurent Grangeau <[email protected]>
Co-authored-by: Charlie Wang <[email protected]>
Co-authored-by: Mårten Lindblad <[email protected]>
…s are executable with a previous created feature table (#254)
Labels
None yet
Projects
None yet
4 participants