From 79040f45766a2b66e0506d9931a5278ca29953d6 Mon Sep 17 00:00:00 2001 From: SpencerFleury <159941756+SpencerFleury@users.noreply.github.com> Date: Mon, 30 Sep 2024 11:45:57 -0700 Subject: [PATCH 1/9] Profiles article DOC-233; added info on Databricks ALSO @markzegarelli: I changed the slug on this from profile-properties to profiles. Can you set up a redirect? Thanks! --- .../collections/data/en/profile-properties.md | 101 -------------- content/collections/data/en/profiles.md | 126 ++++++++++++++++++ 2 files changed, 126 insertions(+), 101 deletions(-) delete mode 100644 content/collections/data/en/profile-properties.md create mode 100644 content/collections/data/en/profiles.md diff --git a/content/collections/data/en/profile-properties.md b/content/collections/data/en/profile-properties.md deleted file mode 100644 index a6fc2410f..000000000 --- a/content/collections/data/en/profile-properties.md +++ /dev/null @@ -1,101 +0,0 @@ ---- -id: 762c167a-caad-4a25-9758-3b35af55857d -blueprint: data -title: Profiles -landing: false -exclude_from_sitemap: false -updated_by: 5817a4fa-a771-417a-aa94-a0b1e7f55eae -updated_at: 1726776950 ---- -**Profiles** enable you to merge customer profile data from your data warehouse with existing behavioral product data already in Amplitude. - -Profiles act as standalone properties, in that they aren't associated with specific events and are instead associated with a user profile. In this way, they're different from traditional user properties. - -Profiles always display the most current data synced from your warehouse. - -## Setup - -To set up a profile in Amplitude, follow these steps: - -1. In Amplitude Data, navigate to *Connections > Catalog* and click the Snowflake tile. -2. In the *Set Up Connection* tab, connect Amplitude to your data warehouse by filling in all the relevant fields under *Snowflake Credentials*. You can either create a new connection, or reuse an existing one. Click *Next* when you're done. -3. 
In the *Verify Instrumentation* tab, follow the steps described in the [Snowflake Data Import guide](/docs/data/source-catalog/snowflake#add-snowflake-as-a-source). Click *Next* when you're done. -4. in the *Select Data* tab, select the `profiles` data type. Amplitude pre-selects the required change data capture import strategy for you, which you can see under the *Select Import Strategy* dropdown: - - * **Insert**: Always on, creates new profiles when added to your table. - * **Update**: Syncs changes to values from your table to Amplitude. - * **Delete**: Syncs deletions from your table to Amplitude. - -When you're done, click *Next* to move on to data mapping. - -{{partial:admonition type='note'}} -If this is the first time you're importing data from this table, set a data retention time and enable change tracking in Snowflake with the following commands: - -```sql -ALTER TABLE DATAPL_DB_STAG.PUBLIC.PROFILE_PROPERTIES_TABLE_1 SET DATA_RETENTION_TIME_IN_DAYS = 7; - -ALTER TABLE DATAPL_DB_STAG.PUBLIC.PROFILE_PROPERTIES_TABLE_1 SET CHANGE_TRACKING = TRUE; -``` -Snowflake Standard Edition plans have a maximum retention time of one day. -{{/partial:admonition}} - -5. You can see a list of your tables under *Select Table.* To begin column mapping, click the table you're interested in. -6. In the list of required fields under *Column Mapping,* enter the column names in the appropriate fields to match columns to required fields. To add more fields, click *+ Add field*. -7. When you're done, click *Test Mapping* to verify your mapping information. When you're ready, click *Next.* -8. Name the source and set the frequency at which Amplitude should refresh your profiles from the data warehouse. - -## Data specifications - -Profiles supports a maximum of 200 warehouse properties, and supports known Amplitude users. A `user_id` must go with each profile. 
- -| Field | Description | Example | -| ------------------- | ----------------------------------------------------------------------------------------------------------------------------- | ------------------------ | -| `user_id` | Identifier for the user. Must have a minimum length of 5. | -| `Profile Property 1` | Profile property set at the user level. The value of this field is the value from the customer’s source since last sync. | -| `Profile Property 2` | Profile property set at the user level. The value of this field is the value from the customer’s source since last sync. | - -Example: -```json -{ - "user_id": 12345, - "number of purchases": 10, - "title": "Data Engineer" -} -``` - -See [this article for information on Snowflake profiles](/docs/data/source-catalog/snowflake#profile-properties). - -## SQL template - -```sql -SELECT - AS "user_id", - AS "profile_property_1", - AS "profile_property_2" -FROM DATABASE_NAME.SCHEMA_NAME.TABLE_OR_VIEW_NAME -``` - -## Clear a profile value - -When you remove profile values in your data warehouse, those values sync to Amplitude during the next sync operation. You can also use Amplitude Data to remove unused property fields from users in Amplitude. 
- -## Sample queries - -```sql -SELECT - user_id as "user_id", - upgrade_propensity_score as "Upgrade Propensity Score", - user_model_version as "User Model Version" -FROM - ml_models.prod_propensity_scoring -``` - -```sql -SELECT - m.uid as "user_id", - m.title as "Title", - m.seniority as "Seniority", - m.dma as "DMA" -FROM - prod_users.demo_data m -``` \ No newline at end of file diff --git a/content/collections/data/en/profiles.md b/content/collections/data/en/profiles.md new file mode 100644 index 000000000..210b5899b --- /dev/null +++ b/content/collections/data/en/profiles.md @@ -0,0 +1,126 @@ +--- +id: 762c167a-caad-4a25-9758-3b35af55857d +blueprint: data +title: Profiles +landing: false +exclude_from_sitemap: false +updated_by: 5817a4fa-a771-417a-aa94-a0b1e7f55eae +updated_at: 1727721833 +--- +**Profiles** enable you to join customer profile data from your data warehouse with existing behavioral product data already in Amplitude. + +Profiles act as standalone properties, in that they aren't associated with specific events and are instead associated with a user profile. They're different from traditional user properties and offer the opportunity to conduct more expansive analyses. + +Profiles always display the most current data synced from your warehouse. + +## Before you begin + +### Snowflake users +If this is your first time importing data from this table, set a data retention time and enable change tracking in Snowflake with the following commands: + +``` +ALTER TABLE DATAPL_DB_STAG.PUBLIC.PROFILES_PROPERTIES_TABLE_1 SET DATA_RETENTION_TIME_IN_DAYS = 7; + +ALTER TABLE DATAPL_DB_STAG.PUBLIC.PROFILES_PROPERTIES_TABLE_1 SET CHANGE_TRACKING = TRUE; +``` +On Snowflake Standard Edition plans, the maximum retention time is one day. If you’re on this plan, you should set the frequency to 12 hours in later steps. 
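Before the first import, you can confirm both settings by inspecting the table metadata; the `SHOW TABLES` output includes `change_tracking` and `retention_time` columns. A sketch, assuming the table name used in the commands above:

```sql
-- Verify change tracking and retention on the profiles table
SHOW TABLES LIKE 'PROFILES_PROPERTIES_TABLE_1' IN SCHEMA DATAPL_DB_STAG.PUBLIC;
```
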
+ +### Databricks users +Follow these instructions to [enable change tracking](https://docs.databricks.com/en/delta/delta-change-data-feed.html#enable): + +* If you're working with a new table, set the table property delta.enableChangeDataFeed = true in the `CREATE TABLE` command: + `CREATE TABLE student (id INT, name STRING, age INT) TBLPROPERTIES (delta.enableChangeDataFeed = true)` + + Also set `spark.databricks.delta.properties.defaults.enableChangeDataFeed = true` for all new tables. + +* If you're working with an existing table, set the table property `delta.enableChangeDataFeed = true` in the `ALTER TABLE` command: + `ALTER TABLE myDeltaTable SET TBLPROPERTIES (delta.enableChangeDataFeed = true)` + +You also have to set a [data retention period](https://docs.databricks.com/en/delta/history.html#configure-data-retention-for-time-travel-queries). This must be at least one day, but in most cases you should set this period to seven days or longer. If your retention period is too short, the import process can fail. + +## Set up a profile (Snowflake users) +To set up a profile in Amplitude, follow these steps: + +1. In Amplitude Data, navigate to Connections Overview. Then in the Sources panel, click Add More. Scroll down until you find the Snowflake tile and click it. +2. In the *Set Up Connection* tab, connect Amplitude to your data warehouse by filling in all the relevant fields under *Snowflake Credentials*, which are outlined in the [Snowflake Data Import guide](/docs/data/source-catalog/snowflake#add-snowflake-as-a-source). You can either create a new connection, or reuse an existing one. Click *Next* when you're done. +3. You can see a list of your tables under *Select Table*. To begin column mapping, click the table you're interested in. +4. In the list of required fields under *Column Mapping*, enter the column names in the appropriate fields to match columns to required fields. To add more fields, click *+ Add field*. +5. 
in the *Select Data* tab, select the `profiles` data type. Amplitude pre-selects the required change data capture import strategy for you, which you can see under the *Select Import Strategy* dropdown:

    * **Insert**: Always on, creates new profiles when added to your table.
    * **Update**: Syncs changes to values from your table to Amplitude.
    * **Delete**: Syncs deletions from your table to Amplitude.

6. When you're done, click *Test Mapping* to verify your mapping information. Then click *Next*.
7. Name the source and set the frequency at which Amplitude should refresh your profiles from the data warehouse. You should set the frequency to 12 hours if you are on Snowflake Standard Edition.

## Set up a profile (Databricks users)
To set up a profile in Amplitude, follow these steps:

1. In Amplitude Data, navigate to *Connections Overview*. Then in the *Sources* panel, click *Add More*. Scroll down until you find the Databricks tile and click it.
2. In the *Set Up Connection* tab, connect Amplitude to your data warehouse. Have the following information ready:
    * **Server hostname**: This is the hostname of your Databricks cluster. You can find it in your cluster configuration by navigating to *Advanced Options -> JDBC/ODBC -> Server Hostname*.
    * **HTTP path**: This is the HTTP path of the cluster you would like to connect to. You can find it in your cluster configuration by navigating to *Advanced Options -> JDBC/ODBC -> HTTP Path*.
    * **Personal access token**: Use the personal access token to authenticate with your Databricks cluster. [Learn how to create one](https://docs.databricks.com/en/dev-tools/auth/index.html#common-tasks-for-databricks-authentication).

    Click *Next* when you're done.
3. You can see a list of your tables under *Select Table*. To begin column mapping, click the table you're interested in.
4. 
In the list of required fields under *Column Mapping*, enter the column names in the appropriate fields to match columns to required fields. To add more fields, click *+ Add field*.
5. In the *Data Selection* tab, select the `profiles` data type.
6. When you're done, click *Test Mapping* to verify your mapping information. Then click *Next*.
7. Name the source and set the frequency at which Amplitude should refresh your profiles from the data warehouse. The default frequency is 12 hours, but you can change it.

## Data specifications
Profiles supports a maximum of 200 warehouse properties and supports known Amplitude users. Each profile must include a `user_id`.

| Field | Description | Example |
| ------------------- | ----------------------------------------------------------------------------------------------------------------------------- | ------------------------ |
| `user_id` | Identifier for the user. Must have a minimum length of 5. | `12345` |
| `Profile Property 1` | Profile property set at the user level. The value of this field is the value from the customer’s source since the last sync. | `10` |
| `Profile Property 2` | Profile property set at the user level. The value of this field is the value from the customer’s source since the last sync. | `Data Engineer` |

Example:
```json
{
  "user_id": 12345,
  "number of purchases": 10,
  "title": "Data Engineer"
}
```

See [this article for information on Snowflake profiles](/docs/data/source-catalog/snowflake#profile-properties).

## SQL template

```sql
SELECT
   AS "user_id",
   AS "profile_property_1",
   AS "profile_property_2"
FROM DATABASE_NAME.SCHEMA_NAME.TABLE_OR_VIEW_NAME
```

## Clear a profile value

When you remove profile values in your data warehouse, those values sync to Amplitude during the next sync operation. You can also use Amplitude Data to remove unused property fields from users in Amplitude. 
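For example, a hypothetical warehouse-side update that clears a single property for one user, so the value is removed in Amplitude on the next sync (the table, columns, and ID below are illustrative, borrowed from the sample queries):

```sql
-- Setting the column to NULL clears the property on the next sync
UPDATE prod_users.demo_data
SET seniority = NULL
WHERE uid = 'user_123';
```
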
+ +## Sample queries + +```sql +SELECT + user_id as "user_id", + upgrade_propensity_score as "Upgrade Propensity Score", + user_model_version as "User Model Version" +FROM + ml_models.prod_propensity_scoring +``` + +```sql +SELECT + m.uid as "user_id", + m.title as "Title", + m.seniority as "Seniority", + m.dma as "DMA" +FROM + prod_users.demo_data m +``` \ No newline at end of file From 0ceae12ab6fdbd5a824cc2dd5a20d3808cf30048 Mon Sep 17 00:00:00 2001 From: markzegarelli Date: Mon, 30 Sep 2024 11:54:52 -0700 Subject: [PATCH 2/9] Add redirect --- content/trees/collections/en/data.yaml | 2 -- vercel.json | 5 +++++ 2 files changed, 5 insertions(+), 2 deletions(-) diff --git a/content/trees/collections/en/data.yaml b/content/trees/collections/en/data.yaml index fbb84227d..dab88c313 100644 --- a/content/trees/collections/en/data.yaml +++ b/content/trees/collections/en/data.yaml @@ -13,8 +13,6 @@ tree: entry: 1c4b9202-0063-4acf-9b99-66b73435630b - entry: 42e4e239-54c9-4ac4-9170-b535a1fd5eba - - - entry: 762c167a-caad-4a25-9758-3b35af55857d - entry: e738a4d5-a463-405e-aad0-665115a2b631 - diff --git a/vercel.json b/vercel.json index b942cdcb9..e0469959e 100644 --- a/vercel.json +++ b/vercel.json @@ -1,6 +1,11 @@ { "trailingSlash": false, "redirects": [ + { + "source": "/docs/data/profile-properties", + "destination": "/docs/data/profiles", + "statusCode": 301 + }, { "source": "/docs/hc/en-us/categories/5078631395227((?:-[a-zA-Z0-9-]+)?)", "destination": "/docs/data", From 41cedc10a0c9a4061000310a381b490a835a7b68 Mon Sep 17 00:00:00 2001 From: SpencerFleury <159941756+SpencerFleury@users.noreply.github.com> Date: Mon, 30 Sep 2024 11:59:55 -0700 Subject: [PATCH 3/9] Update data.yaml Nav entry added --- content/trees/navigation/en/data.yaml | 3 +++ 1 file changed, 3 insertions(+) diff --git a/content/trees/navigation/en/data.yaml b/content/trees/navigation/en/data.yaml index 5ef804960..65c93c573 100644 --- a/content/trees/navigation/en/data.yaml +++ 
b/content/trees/navigation/en/data.yaml @@ -204,6 +204,9 @@ tree: - id: eb2ff912-6151-4093-b5c0-61762b0f4fc5 entry: 69cefed6-2b87-4333-8cc6-ba5bac1b41e5 + - + id: d903dce8-3a04-4f78-a85b-5703d0e6ef29 + entry: 762c167a-caad-4a25-9758-3b35af55857d - id: 2d36d2b4-0d23-4bf0-9127-224901dfa27a entry: 1c4b9202-0063-4acf-9b99-66b73435630b From 9965bd2b6c91dbaeb9839062a928acb03b27bd12 Mon Sep 17 00:00:00 2001 From: SpencerFleury <159941756+SpencerFleury@users.noreply.github.com> Date: Mon, 30 Sep 2024 15:37:23 -0700 Subject: [PATCH 4/9] Update content/collections/data/en/profiles.md Co-authored-by: markzegarelli --- content/collections/data/en/profiles.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/collections/data/en/profiles.md b/content/collections/data/en/profiles.md index 210b5899b..f897cef20 100644 --- a/content/collections/data/en/profiles.md +++ b/content/collections/data/en/profiles.md @@ -18,7 +18,7 @@ Profiles always display the most current data synced from your warehouse. 
### Snowflake users If this is your first time importing data from this table, set a data retention time and enable change tracking in Snowflake with the following commands: -``` +```sql ALTER TABLE DATAPL_DB_STAG.PUBLIC.PROFILES_PROPERTIES_TABLE_1 SET DATA_RETENTION_TIME_IN_DAYS = 7; ALTER TABLE DATAPL_DB_STAG.PUBLIC.PROFILES_PROPERTIES_TABLE_1 SET CHANGE_TRACKING = TRUE; From cf2a0d7137c754bdb4b2068ab602587f9ceeb762 Mon Sep 17 00:00:00 2001 From: SpencerFleury <159941756+SpencerFleury@users.noreply.github.com> Date: Mon, 30 Sep 2024 15:37:41 -0700 Subject: [PATCH 5/9] Update content/collections/data/en/profiles.md Co-authored-by: markzegarelli --- content/collections/data/en/profiles.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/collections/data/en/profiles.md b/content/collections/data/en/profiles.md index f897cef20..9758ed2f3 100644 --- a/content/collections/data/en/profiles.md +++ b/content/collections/data/en/profiles.md @@ -28,7 +28,7 @@ On Snowflake Standard Edition plans, the maximum retention time is one day. If y ### Databricks users Follow these instructions to [enable change tracking](https://docs.databricks.com/en/delta/delta-change-data-feed.html#enable): -* If you're working with a new table, set the table property delta.enableChangeDataFeed = true in the `CREATE TABLE` command: +* If you're working with a new table, set the table property `delta.enableChangeDataFeed = true` in the `CREATE TABLE` command: `CREATE TABLE student (id INT, name STRING, age INT) TBLPROPERTIES (delta.enableChangeDataFeed = true)` Also set `spark.databricks.delta.properties.defaults.enableChangeDataFeed = true` for all new tables. 
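The property set above can be confirmed on an existing table with `SHOW TBLPROPERTIES` (a sketch, reusing the example table name from the docs):

```sql
-- Returns delta.enableChangeDataFeed and its current value
SHOW TBLPROPERTIES student ('delta.enableChangeDataFeed');
```
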
From 87a96dfb2327074c4bb8685e1f95f650071d12eb Mon Sep 17 00:00:00 2001 From: SpencerFleury <159941756+SpencerFleury@users.noreply.github.com> Date: Mon, 30 Sep 2024 15:37:58 -0700 Subject: [PATCH 6/9] Update content/collections/data/en/profiles.md Co-authored-by: markzegarelli --- content/collections/data/en/profiles.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/collections/data/en/profiles.md b/content/collections/data/en/profiles.md index 9758ed2f3..6f90fb4f3 100644 --- a/content/collections/data/en/profiles.md +++ b/content/collections/data/en/profiles.md @@ -36,7 +36,7 @@ Follow these instructions to [enable change tracking](https://docs.databricks.co * If you're working with an existing table, set the table property `delta.enableChangeDataFeed = true` in the `ALTER TABLE` command: `ALTER TABLE myDeltaTable SET TBLPROPERTIES (delta.enableChangeDataFeed = true)` -You also have to set a [data retention period](https://docs.databricks.com/en/delta/history.html#configure-data-retention-for-time-travel-queries). This must be at least one day, but in most cases you should set this period to seven days or longer. If your retention period is too short, the import process can fail. +Set a [data retention period](https://docs.databricks.com/en/delta/history.html#configure-data-retention-for-time-travel-queries). This must be at least one day, but in most cases you should set this period to seven days or longer. If your retention period is too short, the import process can fail. 
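A sketch of setting such a retention period on a Delta table; the property names come from the Databricks documentation, and the table name is illustrative:

```sql
-- Keep 7 days of history so change data feed reads don't outrun retention
ALTER TABLE myDeltaTable SET TBLPROPERTIES (
  'delta.logRetentionDuration' = 'interval 7 days',
  'delta.deletedFileRetentionDuration' = 'interval 7 days'
);
```
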
## Set up a profile (Snowflake users) To set up a profile in Amplitude, follow these steps: From aff2f3ca512cfc53b6a175b834c1a94a61b87c98 Mon Sep 17 00:00:00 2001 From: SpencerFleury <159941756+SpencerFleury@users.noreply.github.com> Date: Mon, 30 Sep 2024 15:38:46 -0700 Subject: [PATCH 7/9] Update content/collections/data/en/profiles.md Co-authored-by: markzegarelli --- content/collections/data/en/profiles.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/collections/data/en/profiles.md b/content/collections/data/en/profiles.md index 6f90fb4f3..68552e516 100644 --- a/content/collections/data/en/profiles.md +++ b/content/collections/data/en/profiles.md @@ -45,7 +45,7 @@ To set up a profile in Amplitude, follow these steps: 2. In the *Set Up Connection* tab, connect Amplitude to your data warehouse by filling in all the relevant fields under *Snowflake Credentials*, which are outlined in the [Snowflake Data Import guide](/docs/data/source-catalog/snowflake#add-snowflake-as-a-source). You can either create a new connection, or reuse an existing one. Click *Next* when you're done. 3. You can see a list of your tables under *Select Table*. To begin column mapping, click the table you're interested in. 4. In the list of required fields under *Column Mapping*, enter the column names in the appropriate fields to match columns to required fields. To add more fields, click *+ Add field*. -5. in the *Select Data* tab, select the `profiles` data type. Amplitude pre-selects the required change data capture import strategy for you, which you can see under the *Select Import Strategy* dropdown: +5. On the *Select Data* tab, select the `profiles` data type. Amplitude pre-selects the required change data capture import strategy for you, which you can see under the *Select Import Strategy* dropdown: * **Insert**: Always on, creates new profiles when added to your table. * **Update**: Syncs changes to values from your table to Amplitude. 
From 2279e746b5527ce41486d9f6c40c995b7ca0a7a3 Mon Sep 17 00:00:00 2001 From: SpencerFleury <159941756+SpencerFleury@users.noreply.github.com> Date: Mon, 30 Sep 2024 15:39:14 -0700 Subject: [PATCH 8/9] Update content/collections/data/en/profiles.md Co-authored-by: markzegarelli --- content/collections/data/en/profiles.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/collections/data/en/profiles.md b/content/collections/data/en/profiles.md index 68552e516..5b6c98d01 100644 --- a/content/collections/data/en/profiles.md +++ b/content/collections/data/en/profiles.md @@ -42,7 +42,7 @@ Set a [data retention period](https://docs.databricks.com/en/delta/history.html# To set up a profile in Amplitude, follow these steps: 1. In Amplitude Data, navigate to Connections Overview. Then in the Sources panel, click Add More. Scroll down until you find the Snowflake tile and click it. -2. In the *Set Up Connection* tab, connect Amplitude to your data warehouse by filling in all the relevant fields under *Snowflake Credentials*, which are outlined in the [Snowflake Data Import guide](/docs/data/source-catalog/snowflake#add-snowflake-as-a-source). You can either create a new connection, or reuse an existing one. Click *Next* when you're done. +2. On the *Set Up Connection* tab, connect Amplitude to your data warehouse by filling in all the relevant fields under *Snowflake Credentials*, which are outlined in the [Snowflake Data Import guide](/docs/data/source-catalog/snowflake#add-snowflake-as-a-source). You can either create a new connection, or reuse an existing one. Click *Next* when you're done. 3. You can see a list of your tables under *Select Table*. To begin column mapping, click the table you're interested in. 4. In the list of required fields under *Column Mapping*, enter the column names in the appropriate fields to match columns to required fields. To add more fields, click *+ Add field*. 5. 
On the *Select Data* tab, select the `profiles` data type. Amplitude pre-selects the required change data capture import strategy for you, which you can see under the *Select Import Strategy* dropdown:

From 9a664ab5409b8fea1ac2d13fe3f967abd5830fae Mon Sep 17 00:00:00 2001
From: SpencerFleury <159941756+SpencerFleury@users.noreply.github.com>
Date: Mon, 30 Sep 2024 15:40:06 -0700
Subject: [PATCH 9/9] Update content/collections/data/en/profiles.md

---
 content/collections/data/en/profiles.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/content/collections/data/en/profiles.md b/content/collections/data/en/profiles.md
index 5b6c98d01..78a70e25a 100644
--- a/content/collections/data/en/profiles.md
+++ b/content/collections/data/en/profiles.md
@@ -41,7 +41,7 @@ Set a [data retention period](https://docs.databricks.com/en/delta/history.html#

## Set up a profile (Snowflake users)
To set up a profile in Amplitude, follow these steps:

1. In Amplitude Data, navigate to *Connections Overview*. Then in the *Sources* panel, click *Add More*. Scroll down until you find the Snowflake tile and click it.
2. On the *Set Up Connection* tab, connect Amplitude to your data warehouse by filling in all the relevant fields under *Snowflake Credentials*, which are outlined in the [Snowflake Data Import guide](/docs/data/source-catalog/snowflake#add-snowflake-as-a-source). You can either create a new connection, or reuse an existing one. Click *Next* when you're done.
3. You can see a list of your tables under *Select Table*. To begin column mapping, click the table you're interested in.
4. In the list of required fields under *Column Mapping*, enter the column names in the appropriate fields to match columns to required fields. To add more fields, click *+ Add field*.