
Profiles article #306

Merged
merged 10 commits into from
Sep 30, 2024
101 changes: 0 additions & 101 deletions content/collections/data/en/profile-properties.md

This file was deleted.

126 changes: 126 additions & 0 deletions content/collections/data/en/profiles.md
@@ -0,0 +1,126 @@
---
id: 762c167a-caad-4a25-9758-3b35af55857d
blueprint: data
title: Profiles
landing: false
exclude_from_sitemap: false
updated_by: 5817a4fa-a771-417a-aa94-a0b1e7f55eae
updated_at: 1727721833
---
**Profiles** enable you to join customer profile data from your data warehouse with the behavioral product data already in Amplitude.

Profiles act as standalone properties: they aren't associated with specific events, but with a user profile. They differ from traditional user properties and enable more expansive analyses.

Profiles always display the most current data synced from your warehouse.

## Before you begin

### Snowflake users
If this is your first time importing data from this table, set a data retention time and enable change tracking in Snowflake with the following commands:

```sql
ALTER TABLE DATAPL_DB_STAG.PUBLIC.PROFILES_PROPERTIES_TABLE_1 SET DATA_RETENTION_TIME_IN_DAYS = 7;

ALTER TABLE DATAPL_DB_STAG.PUBLIC.PROFILES_PROPERTIES_TABLE_1 SET CHANGE_TRACKING = TRUE;
```
On Snowflake Standard Edition plans, the maximum retention time is one day. If you’re on this plan, you should set the frequency to 12 hours in later steps.
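To confirm that both settings took effect, one option is Snowflake's `SHOW TABLES` output, whose columns include `retention_time` and `change_tracking`. A minimal check, using the same example table:

```sql
-- Optional verification: the output includes "retention_time"
-- and "change_tracking" columns for the table.
SHOW TABLES LIKE 'PROFILES_PROPERTIES_TABLE_1' IN SCHEMA DATAPL_DB_STAG.PUBLIC;
```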

### Databricks users
Follow these instructions to [enable change tracking](https://docs.databricks.com/en/delta/delta-change-data-feed.html#enable):

* If you're working with a new table, set the table property `delta.enableChangeDataFeed = true` in the `CREATE TABLE` command:
`CREATE TABLE student (id INT, name STRING, age INT) TBLPROPERTIES (delta.enableChangeDataFeed = true)`

To enable the change data feed by default for all new tables, you can also set the Spark configuration `spark.databricks.delta.properties.defaults.enableChangeDataFeed = true`.

* If you're working with an existing table, set the table property `delta.enableChangeDataFeed = true` in the `ALTER TABLE` command:
`ALTER TABLE myDeltaTable SET TBLPROPERTIES (delta.enableChangeDataFeed = true)`

Set a [data retention period](https://docs.databricks.com/en/delta/history.html#configure-data-retention-for-time-travel-queries). It must be at least one day, but you should usually set it to seven days or longer. If the retention period is too short, the import process can fail.
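As a sketch of the combined Databricks setup, assuming the example table name `myDeltaTable` from above and a seven-day retention window (`delta.deletedFileRetentionDuration` and `delta.logRetentionDuration` are the table properties the linked retention doc describes):

```sql
-- Enable the change data feed by default for tables created in this session.
SET spark.databricks.delta.properties.defaults.enableChangeDataFeed = true;

-- Enable the change data feed and a seven-day time-travel window on an existing table.
ALTER TABLE myDeltaTable SET TBLPROPERTIES (
  delta.enableChangeDataFeed = true,
  delta.deletedFileRetentionDuration = 'interval 7 days',
  delta.logRetentionDuration = 'interval 30 days'
);
```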

## Set up a profile (Snowflake users)
To set up a profile in Amplitude, follow these steps:

1. In Amplitude Data, navigate to *Connections Overview*. Then in the *Sources* panel, click *Add More*. Scroll down until you find the Snowflake tile and click it.
2. On the *Set Up Connection* tab, connect Amplitude to your data warehouse by filling in the relevant fields under *Snowflake Credentials*, following the [Snowflake Data Import guide](/docs/data/source-catalog/snowflake#add-snowflake-as-a-source). You can either create a new connection or reuse an existing one. Click *Next* when you're done.

3. You can see a list of your tables under *Select Table*. To begin column mapping, click the table you're interested in.
4. In the list of required fields under *Column Mapping*, enter the column names in the appropriate fields to match columns to required fields. To add more fields, click *+ Add field*.
5. On the *Select Data* tab, select the `profiles` data type. Amplitude pre-selects the required change data capture import strategy for you, which you can see under the *Select Import Strategy* dropdown:

* **Insert**: Always on. Creates a new profile when a row is added to your table.
* **Update**: Syncs value changes from your table to Amplitude.
* **Delete**: Syncs deletions from your table to Amplitude.

6. When you're done, click *Test Mapping* to verify your mapping information. Then click *Next*.
7. Name the source and set the frequency at which Amplitude should refresh your profiles from the data warehouse. If you're on Snowflake Standard Edition, set the frequency to 12 hours.

## Set up a profile (Databricks users)
To set up a profile in Amplitude, follow these steps:

1. In Amplitude Data, navigate to *Connections Overview*. Then in the *Sources* panel, click *Add More*. Scroll down until you find the Databricks tile and click it.
2. In the *Set Up Connection* tab, connect Amplitude to your data warehouse. Have the following information ready:
* **Server hostname**: The hostname of your Databricks cluster. Find it in your cluster configuration under *Advanced Options -> JDBC/ODBC -> Server Hostname*.
* **HTTP path**: The HTTP path of the cluster you want to connect to. Find it in your cluster configuration under *Advanced Options -> JDBC/ODBC -> HTTP Path*.
* **Personal access token**: Use a personal access token to authenticate with your Databricks cluster. [Learn how to create one](https://docs.databricks.com/en/dev-tools/auth/index.html#common-tasks-for-databricks-authentication).

Click *Next* when you're done.
3. You can see a list of your tables under *Select Table*. To begin column mapping, click the table you're interested in.
4. In the list of required fields under *Column Mapping*, enter the column names in the appropriate fields to match columns to required fields. To add more fields, click *+ Add field*.
5. In the *Data Selection* tab, select the `profiles` data type.
6. When you're done, click *Test Mapping* to verify your mapping information. Then click *Next*.
7. Name the source and set the frequency at which Amplitude should refresh your profiles from the data warehouse. The default frequency is 12 hours, but you can change it.

## Data specifications
Profiles supports a maximum of 200 warehouse properties and works only with known Amplitude users. Each profile must include a `user_id`.

| Field | Description | Example |
| --------------------- | ------------------------------------------------------------------------------------------------------------------------ | --------------------------- |
| `user_id` | Identifier for the user. Must have a minimum length of 5. | `12345` |
| `Profile Property 1` | A profile property set at the user level. The value of this field is the most recent value synced from the customer's source. | `"number of purchases": 10` |
| `Profile Property 2` | A profile property set at the user level. The value of this field is the most recent value synced from the customer's source. | `"title": "Data Engineer"` |

Example:
```json
{
"user_id": 12345,
"number of purchases": 10,
"title": "Data Engineer"
}
```
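For reference, a hypothetical warehouse table matching this shape might look like the sketch below; the table and column names are illustrative, not required:

```sql
-- Hypothetical source table for the example profile above.
CREATE TABLE IF NOT EXISTS DATABASE_NAME.SCHEMA_NAME.USER_PROFILES (
    user_id             VARCHAR NOT NULL, -- identifier for a known Amplitude user
    number_of_purchases INTEGER,          -- maps to "Profile Property 1"
    title               VARCHAR           -- maps to "Profile Property 2"
);
```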

For information about Snowflake profiles, see [this article](/docs/data/source-catalog/snowflake#profile-properties).

## SQL template

Use the following template, replacing the uppercase placeholders with your own column, database, schema, and table or view names:

```sql
SELECT
    COLUMN_NAME_1 AS "user_id",
    COLUMN_NAME_2 AS "profile_property_1",
    COLUMN_NAME_3 AS "profile_property_2"
FROM DATABASE_NAME.SCHEMA_NAME.TABLE_OR_VIEW_NAME
```

## Clear a profile value

When you remove profile values in your data warehouse, those removals sync to Amplitude during the next sync operation. You can also use Amplitude Data to remove unused property fields from users in Amplitude.
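As a minimal sketch, assuming the hypothetical `USER_PROFILES` table from the data specifications section, removing a value could be as simple as setting the column to `NULL`; the removal reaches Amplitude on the next scheduled sync:

```sql
-- Hypothetical example: clear one user's "title" profile property.
UPDATE DATABASE_NAME.SCHEMA_NAME.USER_PROFILES
SET title = NULL
WHERE user_id = '12345';
```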

## Sample queries

```sql
-- Sync machine learning propensity scores as profile properties.
SELECT
    user_id AS "user_id",
    upgrade_propensity_score AS "Upgrade Propensity Score",
    user_model_version AS "User Model Version"
FROM
    ml_models.prod_propensity_scoring
```

```sql
-- Sync demographic attributes as profile properties.
SELECT
    m.uid AS "user_id",
    m.title AS "Title",
    m.seniority AS "Seniority",
    m.dma AS "DMA"
FROM
    prod_users.demo_data m
```
2 changes: 0 additions & 2 deletions content/trees/collections/en/data.yaml
@@ -13,8 +13,6 @@ tree:
entry: 1c4b9202-0063-4acf-9b99-66b73435630b
-
entry: 42e4e239-54c9-4ac4-9170-b535a1fd5eba
-
entry: 762c167a-caad-4a25-9758-3b35af55857d
-
entry: e738a4d5-a463-405e-aad0-665115a2b631
-
3 changes: 3 additions & 0 deletions content/trees/navigation/en/data.yaml
@@ -204,6 +204,9 @@ tree:
-
id: eb2ff912-6151-4093-b5c0-61762b0f4fc5
entry: 69cefed6-2b87-4333-8cc6-ba5bac1b41e5
-
id: d903dce8-3a04-4f78-a85b-5703d0e6ef29
entry: 762c167a-caad-4a25-9758-3b35af55857d
-
id: 2d36d2b4-0d23-4bf0-9127-224901dfa27a
entry: 1c4b9202-0063-4acf-9b99-66b73435630b
5 changes: 5 additions & 0 deletions vercel.json
@@ -1,6 +1,11 @@
{
"trailingSlash": false,
"redirects": [
{
"source": "/docs/data/profile-properties",
"destination": "/docs/data/profiles",
"statusCode": 301
},
{
"source": "/docs/hc/en-us/categories/5078631395227((?:-[a-zA-Z0-9-]+)?)",
"destination": "/docs/data",