Commit

Bug/renamed columns option3 (#48)
* bug/renamed-columns

* add macro file

* add casting

* update yml

* add casting for all

* macro updates

* bug/renamed-columns-dynamic

* adjustments

* add consistency tests

* update yml

* bug/renamed-columns-option3

* rename variables for readability

* adjust casts

* update yml

* update version & changelog

* update variable for readability

* small adjustment

* revise consistency test

* update changelog

* update test

* update changelog

* update test & macro

* update changelog

* make breaking

* regen docs

* update packages

* update test

* update comments
fivetran-catfritz authored Jul 15, 2024
1 parent b41bbcc commit 197838b
Showing 35 changed files with 542 additions and 382 deletions.
16 changes: 16 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
@@ -1,3 +1,19 @@
# dbt_salesforce_source v1.1.0
[PR #48](https://github.com/fivetran/dbt_salesforce_source/pull/48) includes the following updates:

## 🚨 Breaking Change 🚨
- Added logic to support user-specified scenarios where the Fivetran Salesforce connector syncs column names using the original Salesforce API naming convention. For example, while Fivetran typically provides the column as `created_date`, some users might choose to receive it as `CreatedDate` according to the API naming. This update ensures the package is automatically compatible with both naming conventions.
- Specifically, the package now performs a COALESCE, preferring the original Salesforce API naming. If the original naming is not present, the Fivetran version is used instead.
- Renamed columns are now explicitly cast to prevent conflicts during the COALESCE.
- ❗This change is considered breaking since the resulting column types may differ from prior versions of this package.
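
The preference order described above can be illustrated outside SQL. The following is a minimal Python sketch, not the package's code: it assumes a row is a plain dict and uses a hypothetical helper, `coalesce_cast`, that casts whichever spelling is present and returns the first non-null value, trying the Salesforce API spelling first.

```python
from typing import Any, Callable, Optional


def coalesce_cast(row: dict, api_name: str, fivetran_name: str,
                  cast: Callable[[Any], Any]) -> Optional[Any]:
    """Hypothetical helper: first non-null value, API spelling preferred."""
    for name in (api_name, fivetran_name):
        value = row.get(name)
        if value is not None:
            # Casting both candidates to one type mirrors the explicit casts
            # the changelog mentions, so the two spellings cannot conflict.
            return cast(value)
    return None


rows = [
    {"CreatedDate": "2024-07-15", "created_date": None},  # API spelling present
    {"CreatedDate": None, "created_date": "2024-07-14"},  # only Fivetran spelling
    {"CreatedDate": None, "created_date": None},          # neither present
]
results = [coalesce_cast(r, "CreatedDate", "created_date", str) for r in rows]
print(results)
```

The third row shows that the result stays null when neither spelling carries a value, which is also how SQL `COALESCE` behaves.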

## Under the Hood
- Added the following macros to support the bug fix logic described above:
  - `add_renamed_columns`: Determines the original Salesforce API name for each column and adds it to the list generated within each `get_*_columns` macro. By default, the macro derives the name by removing underscores and capitalizing the first letter of each underscore-separated part (for example, `created_date` becomes `CreatedDate`). This ensures all necessary columns are available for use in the `coalesce_rename` macro. The macro also tags each column with its renamed version so the pairing can be tracked.
  - `column_list_to_dict`: Converts the list of dictionaries generated by the `get_*_columns` macros into a dictionary of dictionaries for use in the `coalesce_rename` macro. This conversion is necessary so that each column dictionary entry can be accessed by key rather than by iterating through a list.
  - `coalesce_rename`: Uses the dictionary generated by `column_list_to_dict` to coalesce a column with its renamed counterpart, producing the final column. This macro also accepts a custom renamed spelling, datatype, and alias as arguments to override the default values.
- Added a validation test to ensure the final column names generated before and after this update remain the same.
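
The first two macros can be mirrored in Python to make the data flow concrete. This is a sketch that follows the Jinja shown in this commit, not a replacement for it; the three-column input list and its datatype strings are made up for illustration. Python's `str.capitalize()` uppercases the first character and lowercases the rest, which is the behavior the sketch assumes for the Jinja `capitalize` filter.

```python
def add_renamed_columns(column_list):
    """Append API-style spellings and tag each original column (mutates the list)."""
    renamed_columns = []
    for col in column_list:
        original = col["name"]
        if "fivetran" in original:
            continue  # Fivetran system columns are left untouched
        # Use an explicit override if one is present, else derive CamelCase.
        derived = "".join(part.capitalize() for part in original.split("_"))
        renamed = col.get("renamed_column_name", derived)
        # Only add a second column when the spelling differs beyond letter case.
        if renamed.lower() != original.lower():
            renamed_columns.append(
                {"name": renamed, "datatype": col["datatype"], "is_rename": True})
        col.update({"renamed_column_name": renamed, "is_rename": False})
    column_list.extend(renamed_columns)


def column_list_to_dict(column_list):
    """Key the original (non-rename) columns by name for direct lookup."""
    return {col["name"]: col for col in column_list if not col.get("is_rename")}


# Illustrative input only; the real lists live in the get_*_columns macros.
columns = [
    {"name": "_fivetran_synced", "datatype": "timestamp"},
    {"name": "created_date", "datatype": "timestamp"},
    {"name": "id", "datatype": "string"},
]
add_renamed_columns(columns)
column_dict = column_list_to_dict(columns)
print([c["name"] for c in columns])
print(column_dict["created_date"]["renamed_column_name"])
```

Note that `id` gets the tag `Id` but no extra list entry, because the two spellings differ only by case.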

# dbt_salesforce_source v1.0.1

[PR #44](https://github.com/fivetran/dbt_salesforce_source/pull/44) includes the following updates:
2 changes: 1 addition & 1 deletion README.md
@@ -51,7 +51,7 @@ If you are **not** using the [Salesforce transformation package](https://github.
```yaml
packages:
- package: fivetran/salesforce_source
version: [">=1.0.0", "<1.1.0"] # we recommend using ranges to capture non-breaking changes automatically
version: [">=1.1.0", "<1.2.0"] # we recommend using ranges to capture non-breaking changes automatically
```
## Step 3: Configure Your Variables
### Database and Schema Variables (Using the standard Salesforce schema only)
2 changes: 1 addition & 1 deletion dbt_project.yml
@@ -1,6 +1,6 @@
config-version: 2
name: 'salesforce_source'
version: '1.0.1'
version: '1.1.0'
require-dbt-version: [">=1.3.0", "<2.0.0"]
models:
salesforce_source:
2 changes: 1 addition & 1 deletion docs/catalog.json

Large diffs are not rendered by default.

47 changes: 10 additions & 37 deletions docs/index.html

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion docs/manifest.json

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion docs/run_results.json

Large diffs are not rendered by default.

3 changes: 2 additions & 1 deletion integration_tests/.gitignore
@@ -2,4 +2,5 @@
target/
dbt_modules/
logs/
env/
env/
package-lock.yml
5 changes: 4 additions & 1 deletion integration_tests/dbt_project.yml
@@ -1,10 +1,13 @@
name: 'salesforce_source_integration_tests'
version: '1.0.1'
version: '1.1.0'

config-version: 2

profile: 'integration_tests'

models:
+schema: "salesforce_source_{{ var('directed_schema','dev') }}"

vars:
salesforce_source:
salesforce_schema: salesforce_source_integrations_tests_3
@@ -0,0 +1,62 @@
{{ config(
tags="fivetran_validations",
enabled=var('fivetran_validation_tests_enabled', false)
) }}

/* This test is to make sure the final columns produced are the same between versions.
Only one test is needed since it will fetch all tables and all columns in each schema.
!!! THIS TEST IS WRITTEN FOR BIGQUERY!!! */
{% if target.type == 'bigquery' %}
with prod as (
select
table_name,
column_name,
data_type
from {{ target.schema }}_salesforce_source_prod.INFORMATION_SCHEMA.COLUMNS
where table_name like 'stg_%'
),

dev as (
select
table_name,
column_name,
data_type
from {{ target.schema }}_salesforce_source_dev.INFORMATION_SCHEMA.COLUMNS
where table_name like 'stg_%'
),

prod_not_in_dev as (
-- rows from prod not found in dev
select * from prod
except distinct
select * from dev
),

dev_not_in_prod as (
-- rows from dev not found in prod
select * from dev
except distinct
select * from prod
),

final as (
select
*,
'from prod' as source
from prod_not_in_dev

union all -- union since we only care if rows are produced

select
*,
'from dev' as source
from dev_not_in_prod
)

select *
from final

{% else %}
{{ print('This is written to run on bigquery. If you need to run on another warehouse, add a version!') }}

{% endif %}
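
The comparison in this test is a symmetric difference over `(table_name, column_name, data_type)` rows. A rough Python equivalent is sketched below, assuming each schema has already been fetched into a list of tuples; `schema_diff` and the sample rows are hypothetical and only illustrate the shape of the result.

```python
def schema_diff(prod, dev):
    """Return rows found in only one schema, labeled with the side they came from."""
    prod_set, dev_set = set(prod), set(dev)
    # Set difference plays the role of `except distinct` in the SQL.
    diffs = [(*row, "from prod") for row in sorted(prod_set - dev_set)]
    diffs += [(*row, "from dev") for row in sorted(dev_set - prod_set)]
    return diffs


# Illustrative rows: one column keeps its type, one changes type between versions.
prod = [
    ("stg_salesforce__account", "is_deleted", "BOOL"),
    ("stg_salesforce__account", "annual_revenue", "FLOAT64"),
]
dev = [
    ("stg_salesforce__account", "is_deleted", "BOOL"),
    ("stg_salesforce__account", "annual_revenue", "NUMERIC"),
]
for row in schema_diff(prod, dev):
    print(row)
```

As with a dbt test, an empty result means the two schemas match; any returned row is a failure to investigate.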
24 changes: 24 additions & 0 deletions macros/add_renamed_columns.sql
@@ -0,0 +1,24 @@
{% macro add_renamed_columns(column_list) %}
{# This macro determines the original Salesforce API name for each column and adds it to the list generated within each `get_*_columns` macro. By default, it derives the name by removing underscores and capitalizing the first letter of each underscore-separated part (for example, `created_date` becomes `CreatedDate`). This ensures all necessary columns are available for use in the `coalesce_rename` macro. The macro also tags each column with its renamed version so the pairing can be tracked. #}

{%- set renamed_columns = [] %}

{%- for col in column_list %}

{%- set original_column_name = col.name %}

{%- if 'fivetran' not in original_column_name %}
{# Use the renamed_column_name value if it is provided in the get_columns macro #}
{%- set renamed_column_name = col.renamed_column_name | default(original_column_name.split('_') | map('capitalize') | join('')) %}

{# Add an entry to the list of renames to populate the filled columns if the rename is different #}
{%- do renamed_columns.append({"name": renamed_column_name, "datatype": col.datatype, "is_rename": true}) if renamed_column_name|lower != original_column_name|lower %}

{# Update the original column with the renamed column name for use later. #}
{%- set col = col.update({ "renamed_column_name": renamed_column_name, "is_rename": false}) %}
{%- endif %}
{%- endfor %}

{%- do column_list.extend(renamed_columns) %}

{% endmacro %}
21 changes: 21 additions & 0 deletions macros/coalesce_rename.sql
@@ -0,0 +1,21 @@
{% macro coalesce_rename(
column_key,
column_dict,
original_column_name=column_dict[column_key]["name"],
datatype=column_dict[column_key]["datatype"],
alias=column_dict[column_key]["alias"] | default(original_column_name),
renamed_column_name=column_dict[column_key]["renamed_column_name"]
) %}

{# This macro accommodates Fivetran connectors that keep the original Salesforce field naming conventions without underscores #}
{# Utilizes the dictionary generated by `column_list_to_dict` to coalesce a column with its renamed counterpart, producing the final column. This macro also allows for the passing of a custom renamed spelling, datatype, and alias as arguments to override default values. #}
{%- if original_column_name|lower == renamed_column_name|lower %}
cast({{ renamed_column_name }} as {{ datatype }}) as {{ alias }}

{%- else %}
coalesce(cast({{ renamed_column_name }} as {{ datatype }}),
cast({{ original_column_name }} as {{ datatype }}))
as {{ alias }}

{%- endif %}
{%- endmacro %}
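
To see what SQL the two branches of `coalesce_rename` emit, the macro can be approximated in Python as a string builder. This is a sketch under assumptions: the keyword arguments fall back to the dictionary entry much as the Jinja defaults do, and the sample `column_dict` is invented for the example.

```python
def coalesce_rename(column_key, column_dict, original_column_name=None,
                    datatype=None, alias=None, renamed_column_name=None):
    """Build the select expression for one column (Python approximation)."""
    col = column_dict[column_key]
    original_column_name = original_column_name or col["name"]
    datatype = datatype or col["datatype"]
    alias = alias or col.get("alias", original_column_name)
    renamed_column_name = renamed_column_name or col["renamed_column_name"]
    if original_column_name.lower() == renamed_column_name.lower():
        # Spellings differ only by case, so a single cast is enough.
        return f"cast({renamed_column_name} as {datatype}) as {alias}"
    # Otherwise prefer the API spelling and fall back to the Fivetran one.
    return (f"coalesce(cast({renamed_column_name} as {datatype}), "
            f"cast({original_column_name} as {datatype})) as {alias}")


# Invented entries shaped like the output of column_list_to_dict.
column_dict = {
    "created_date": {"name": "created_date", "datatype": "timestamp",
                     "renamed_column_name": "CreatedDate"},
    "id": {"name": "id", "datatype": "string", "renamed_column_name": "Id"},
}
print(coalesce_rename("created_date", column_dict))
print(coalesce_rename("id", column_dict, alias="account_id"))
```

Passing `alias`, `datatype`, or `renamed_column_name` explicitly corresponds to the overrides the macro comment describes.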
9 changes: 9 additions & 0 deletions macros/column_list_to_dict.sql
@@ -0,0 +1,9 @@
{% macro column_list_to_dict(column_list) %}
{# This macro converts the list of dictionaries generated by the `get_*_columns` macros into a dictionary of dictionaries for use in the `coalesce_rename` macro. This conversion is necessary so that each column dictionary entry can be accessed by a key, rather than iterating through a list. #}
{%- set column_dict = {} -%}
{%- for col in column_list -%}
{%- do column_dict.update({col.name: col}) if not col.is_rename -%}
{%- endfor -%}
{{ return(column_dict) }}

{% endmacro %}
6 changes: 4 additions & 2 deletions macros/get_account_columns.sql
@@ -3,7 +3,7 @@
{% set columns = [

{"name": "_fivetran_synced", "datatype": dbt.type_timestamp()},
{"name": "_fivetran_active", "datatype": "boolean"},
{"name": "_fivetran_active", "datatype": dbt.type_boolean()},
{"name": "account_number", "datatype": dbt.type_string()},
{"name": "account_source", "datatype": dbt.type_string()},
{"name": "annual_revenue", "datatype": dbt.type_float()},
@@ -16,7 +16,7 @@
{"name": "description", "datatype": dbt.type_string()},
{"name": "id", "datatype": dbt.type_string()},
{"name": "industry", "datatype": dbt.type_string()},
{"name": "is_deleted", "datatype": "boolean"},
{"name": "is_deleted", "datatype": dbt.type_boolean()},
{"name": "last_activity_date", "datatype": dbt.type_timestamp()},
{"name": "last_referenced_date", "datatype": dbt.type_timestamp()},
{"name": "last_viewed_date", "datatype": dbt.type_timestamp()},
@@ -39,6 +39,8 @@
{"name": "website", "datatype": dbt.type_string()}
] %}

{{ salesforce_source.add_renamed_columns(columns) }}

{{ fivetran_utils.add_pass_through_columns(columns, var('salesforce__account_pass_through_columns')) }}

{{ return(columns) }}
4 changes: 3 additions & 1 deletion macros/get_contact_columns.sql
@@ -2,7 +2,7 @@

{% set columns = [
{"name": "_fivetran_synced", "datatype": dbt.type_timestamp()},
{"name": "_fivetran_active", "datatype": "boolean"},
{"name": "_fivetran_active", "datatype": dbt.type_boolean()},
{"name": "account_id", "datatype": dbt.type_string()},
{"name": "department", "datatype": dbt.type_string()},
{"name": "description", "datatype": dbt.type_string()},
@@ -35,6 +35,8 @@
{"name": "title", "datatype": dbt.type_string()},
] %}

{{ salesforce_source.add_renamed_columns(columns) }}

{{ fivetran_utils.add_pass_through_columns(columns, var('salesforce__contact_pass_through_columns')) }}

{{ return(columns) }}
4 changes: 3 additions & 1 deletion macros/get_event_columns.sql
@@ -2,7 +2,7 @@

{% set columns = [
{"name": "_fivetran_synced", "datatype": dbt.type_timestamp()},
{"name": "_fivetran_active", "datatype": "boolean"},
{"name": "_fivetran_active", "datatype": dbt.type_boolean()},
{"name": "account_id", "datatype": dbt.type_string()},
{"name": "activity_date", "datatype": dbt.type_timestamp()},
{"name": "activity_date_time", "datatype": dbt.type_timestamp()},
@@ -32,6 +32,8 @@
{"name": "who_id", "datatype": dbt.type_string()}
] %}

{{ salesforce_source.add_renamed_columns(columns) }}

{{ fivetran_utils.add_pass_through_columns(columns, var('salesforce__event_pass_through_columns')) }}

{{ return(columns) }}
4 changes: 3 additions & 1 deletion macros/get_lead_columns.sql
@@ -2,7 +2,7 @@

{% set columns = [
{"name": "_fivetran_synced", "datatype": dbt.type_timestamp()},
{"name": "_fivetran_active", "datatype": "boolean"},
{"name": "_fivetran_active", "datatype": dbt.type_boolean()},
{"name": "annual_revenue", "datatype": dbt.type_float()},
{"name": "city", "datatype": dbt.type_string()},
{"name": "company", "datatype": dbt.type_string()},
@@ -48,6 +48,8 @@
{"name": "website", "datatype": dbt.type_string()},
] %}

{{ salesforce_source.add_renamed_columns(columns) }}

{{ fivetran_utils.add_pass_through_columns(columns, var('salesforce__lead_pass_through_columns')) }}

{{ return(columns) }}
4 changes: 3 additions & 1 deletion macros/get_opportunity_columns.sql
@@ -34,9 +34,11 @@
{"name": "record_type_id", "datatype": dbt.type_string()},
{"name": "stage_name", "datatype": dbt.type_string()},
{"name": "synced_quote_id", "datatype": dbt.type_string()},
{"name": "type", "datatype": dbt.type_string()},
{"name": "type", "datatype": dbt.type_string()}
] %}

{{ salesforce_source.add_renamed_columns(columns) }}

{{ fivetran_utils.add_pass_through_columns(columns, var('salesforce__opportunity_pass_through_columns')) }}

{{ return(columns) }}
4 changes: 3 additions & 1 deletion macros/get_opportunity_line_item_columns.sql
@@ -2,7 +2,7 @@

{% set columns = [
{"name": "_fivetran_synced", "datatype": dbt.type_timestamp()},
{"name": "_fivetran_active", "datatype": "boolean"},
{"name": "_fivetran_active", "datatype": dbt.type_boolean()},
{"name": "created_by_id", "datatype": dbt.type_string()},
{"name": "created_date", "datatype": dbt.type_timestamp()},
{"name": "description", "datatype": dbt.type_string()},
@@ -29,6 +29,8 @@
{"name": "unit_price", "datatype": dbt.type_float()}
] %}

{{ salesforce_source.add_renamed_columns(columns) }}

{{ fivetran_utils.add_pass_through_columns(columns, var('salesforce__opportunity_line_item_pass_through_columns')) }}

{{ return(columns) }}
4 changes: 3 additions & 1 deletion macros/get_order_columns.sql
@@ -2,7 +2,7 @@

{% set columns = [
{"name": "_fivetran_synced", "datatype": dbt.type_timestamp()},
{"name": "_fivetran_active", "datatype": "boolean"},
{"name": "_fivetran_active", "datatype": dbt.type_boolean()},
{"name": "account_id", "datatype": dbt.type_string()},
{"name": "activated_by_id", "datatype": dbt.type_string()},
{"name": "activated_date", "datatype": dbt.type_timestamp()},
@@ -41,6 +41,8 @@
{"name": "type", "datatype": dbt.type_string()},
] %}

{{ salesforce_source.add_renamed_columns(columns) }}

{{ fivetran_utils.add_pass_through_columns(columns, var('salesforce__order_pass_through_columns')) }}

{{ return(columns) }}
4 changes: 3 additions & 1 deletion macros/get_product_2_columns.sql
@@ -2,7 +2,7 @@

{% set columns = [
{"name": "_fivetran_synced", "datatype": dbt.type_timestamp()},
{"name": "_fivetran_active", "datatype": "boolean"},
{"name": "_fivetran_active", "datatype": dbt.type_boolean()},
{"name": "created_by_id", "datatype": dbt.type_string()},
{"name": "created_date", "datatype": dbt.type_timestamp()},
{"name": "description", "datatype": dbt.type_string()},
@@ -29,6 +29,8 @@
{"name": "revenue_schedule_type", "datatype": dbt.type_string()},
] %}

{{ salesforce_source.add_renamed_columns(columns) }}

{{ fivetran_utils.add_pass_through_columns(columns, var('salesforce__product_2_pass_through_columns')) }}

{{ return(columns) }}
4 changes: 3 additions & 1 deletion macros/get_task_columns.sql
@@ -2,7 +2,7 @@

{% set columns = [
{"name": "_fivetran_synced", "datatype": dbt.type_timestamp()},
{"name": "_fivetran_active", "datatype": "boolean"},
{"name": "_fivetran_active", "datatype": dbt.type_boolean()},
{"name": "account_id", "datatype": dbt.type_string()},
{"name": "activity_date", "datatype": dbt.type_timestamp()},
{"name": "call_disposition", "datatype": dbt.type_string()},
@@ -33,6 +33,8 @@
{"name": "who_id", "datatype": dbt.type_string()}
] %}

{{ salesforce_source.add_renamed_columns(columns) }}

{{ fivetran_utils.add_pass_through_columns(columns, var('salesforce__task_pass_through_columns')) }}

{{ return(columns) }}
4 changes: 3 additions & 1 deletion macros/get_user_columns.sql
@@ -2,7 +2,7 @@

{% set columns = [
{"name": "_fivetran_deleted", "datatype": "boolean"},
{"name": "_fivetran_active", "datatype": "boolean"},
{"name": "_fivetran_active", "datatype": dbt.type_boolean()},
{"name": "_fivetran_synced", "datatype": dbt.type_timestamp()},
{"name": "account_id", "datatype": dbt.type_string()},
{"name": "alias", "datatype": dbt.type_string()},
@@ -34,6 +34,8 @@
{"name": "username", "datatype": dbt.type_string()},
] %}

{{ salesforce_source.add_renamed_columns(columns) }}

{{ fivetran_utils.add_pass_through_columns(columns, var('salesforce__user_pass_through_columns')) }}

{{ return(columns) }}
4 changes: 3 additions & 1 deletion macros/get_user_role_columns.sql
@@ -2,7 +2,7 @@

{% set columns = [
{"name": "_fivetran_deleted", "datatype": "boolean"},
{"name": "_fivetran_active", "datatype": "boolean"},
{"name": "_fivetran_active", "datatype": dbt.type_boolean()},
{"name": "_fivetran_synced", "datatype": dbt.type_timestamp()},
{"name": "developer_name", "datatype": dbt.type_string()},
{"name": "id", "datatype": dbt.type_string()},
@@ -12,6 +12,8 @@
{"name": "rollup_description", "datatype": dbt.type_string()}
] %}

{{ salesforce_source.add_renamed_columns(columns) }}

{{ fivetran_utils.add_pass_through_columns(columns, var('salesforce__user_role_pass_through_columns')) }}

{{ return(columns) }}