Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bug] column_list_to_dict macro will override the column's metadata if you pass through a column that is already selected from the source. #50

Open
2 of 4 tasks
tunguyensinh opened this issue Nov 18, 2024 · 5 comments
Assignees
Labels
type:wontfix This will not be worked on

Comments

@tunguyensinh
Copy link

Is there an existing issue for this?

  • I have searched the existing issues

Describe the issue

For example, if we configure pass-through column email with a different alias

salesforce__lead_pass_through_columns: # pass columns from lead source
  - name: "email"
    alias: "internal_email"
    renamed_column_name: "Email"
    transform_sql: "case when email like '%domain%'  then true else false end"

This line call column_list_to_dict macro, that will override the existing email's metadata, therefore, cause renamed_column_name field empty, result in invalid SQL syntax (rendered by salesforce_source.coalesce_rename macro):

with fields as (
select ...
),
final as (
select
...
    coalesce(cast( as TEXT),
        cast(email as TEXT))
        as internal_email,
from fields
)
select *
from final
where not coalesce(is_deleted, false)
  );

Relevant error log or model output

syntax error at or near "as" in context "(cast( as", at line 793, column 20
  compiled code at target/run/salesforce_source/models/salesforce/stg_salesforce__lead.sql

Expected behavior

Should not override metadata of existing column,

expected:

    coalesce(cast(Email as TEXT),
        cast(email as TEXT))
        as email,

Possible solution

Add the field {is_rename: True} for pass-through columns in get_*_columns.sql or within the fivetran_utils.add_pass_through_columns macro so that the macro column_list_to_dict does not override them. Keen to hear your suggestions.

dbt Project configurations

salesforce__lead_pass_through_columns: # pass columns from lead source
  - name: "email"
    alias: "internal_email"
    renamed_column_name: "Email"
    transform_sql: "case when email like '%domain%'  then true else false end"

Package versions

">=1.1.0", "<1.2.0"

What database are you using dbt with?

redshift

How are you running this dbt package?

dbt Core™

dbt Version

1.8

Additional Context

No response

Are you willing to open a PR to help address this issue?

  • Yes.
  • Yes, but I will need assistance.
  • No.
@tunguyensinh tunguyensinh added the type:bug Something is broken or incorrect label Nov 18, 2024
@fivetran-catfritz
Copy link
Contributor

Hi @tunguyensinh thanks for opening this issue! It is an interesting one. I think your first suggestion

Add the field {is_rename: True} for pass-through columns in get_*_columns.sql

might be the best path since fivetran_utils.add_pass_through_columns is meant for all our packages.

I saw you checked you were willing to open a PR. If that's still the case, for now you can create a fork of this repo, make your changes, and then open a PR against this repo with your changes. We'll let you know what else we need from there, or let us know if you have any other questions!

@fivetran-catfritz
Copy link
Contributor

Hi @tunguyensinh friendly bump--just let us know if you have any questions on opening the PR!

@fivetran-catfritz
Copy link
Contributor

Hi @tunguyensinh I worked on this myself a little, and I believe I have a working test branch. You can install it in place of the salesforce package in your packages.yml:

- git: https://github.com/fivetran/dbt_salesforce_source.git
  revision: test/passthrough-bug
  warn-unpinned: false

I used the below var for the way I set up the changes. I ended up working around the renaming.

vars:
  salesforce__lead_pass_through_columns: # pass columns from lead source
    - name: "Email"
      alias: "internal_email"
      transform_sql: "case when email like '%domain%' then true else false end"

If you're able to try it out, let me know how this works for you!

@fivetran-catfritz fivetran-catfritz self-assigned this Dec 13, 2024
@fivetran-catfritz
Copy link
Contributor

Hi @tunguyensinh just checking in if you had a chance to take a look at this!

@fivetran-catfritz
Copy link
Contributor

fivetran-catfritz commented Dec 16, 2024

Hi @tunguyensinh,

While validating my test branch, I realized the issue was related to the formatting of the passthrough variable. The formatting below is what resolved the issue for my test branch, rather than any changes I made.

Here’s the formatting that worked:

vars:  
  salesforce__lead_pass_through_columns: # pass columns from lead source  
    - name: "Email"  
      alias: "internal_email"  
      transform_sql: "case when email like '%domain%' then true else false end"  

I’ll also make some updates to the documentation to clarify this behavior.

I’ll mark this issue as won’t fix since no code updates are needed, but I’ll keep the issue open so others can reference it if they encounter something similar. Please feel free to ping us here or tag me directly if you have any further questions. Thanks again for bringing this to our attention!

@fivetran-catfritz fivetran-catfritz added type:wontfix This will not be worked on and removed type:bug Something is broken or incorrect labels Dec 16, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type:wontfix This will not be worked on
Projects
None yet
Development

No branches or pull requests

2 participants