
Support Redshift #60

Open
amychen1776 opened this issue Jan 11, 2022 · 6 comments
Labels
enhancement New feature or request

Comments

@amychen1776

It would be epic to have this package support Redshift. One of the things we will need to discuss is how we will recommend loading into Redshift (or whether loading via S3 would be better for Redshift).

@NiallRees
Contributor

NiallRees commented Jan 11, 2022

My initial thoughts are that we could extend the adapter for each warehouse to add an upload function which is accessible through a
{% do adapter.upload_file(file_path, destination) %}
or similar method in a macro/run-operation. That function can then just call any API methods needed to load the data into the warehouse.
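
For illustration, such a run-operation macro might look like the sketch below. This is only a sketch: upload_dbt_artifacts is a hypothetical macro name, and it assumes the adapter exposes the upload_file method proposed above.

{% macro upload_dbt_artifacts(file_path, destination) %}
    {# Hypothetical: delegates to the adapter-level upload_file method proposed above #}
    {% do adapter.upload_file(file_path, destination) %}
{% endmacro %}

It could then be invoked with hypothetical arguments along the lines of: dbt run-operation upload_dbt_artifacts --args '{file_path: target/run_results.json, destination: analytics.dbt_artifacts}'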

@NiallRees
Contributor

NiallRees commented Jul 21, 2022

We've been busy reimplementing the package, opening it up to be compatible with more adapters. Resolving this involves implementing a Redshift version of each warehouse-dispatchable macro defined in https://github.com/brooklyn-data/dbt_artifacts/tree/main/macros.
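
For context, a warehouse-dispatchable macro follows dbt's adapter.dispatch pattern, so adding Redshift support means providing a redshift__ implementation for each such macro. A minimal sketch of the pattern (current_timestamp_sql is a hypothetical macro name used only to illustrate, not one of the package's actual macros):

{% macro current_timestamp_sql() %}
    {{ return(adapter.dispatch('current_timestamp_sql', 'dbt_artifacts')()) }}
{% endmacro %}

{% macro default__current_timestamp_sql() %}
    {# fallback used by adapters without a specific implementation #}
    current_timestamp
{% endmacro %}

{% macro redshift__current_timestamp_sql() %}
    {# Redshift-specific implementation, picked up automatically when running on Redshift #}
    getdate()
{% endmacro %}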

@jecolvin added the enhancement New feature or request label Aug 2, 2022
@ataft

ataft commented Oct 5, 2022

The Redshift macro part is relatively easy to implement. The problem I run into is that the type_string() logic in the source models (e.g. https://github.com/brooklyn-data/dbt_artifacts/blob/main/models/sources/models.sql#L7), which translates to cast(null as TEXT) on Redshift, creates varchar(1) columns in the materialized tables. This makes the insert statements fail.

Any thoughts on a solution? E.g. change null to a dummy string, specify a static length, or allow the models to have different logic based on adapter? (A sketch of the adapter-based option follows the example below.)

Here are some quick tests using the create table logic that dbt writes:

create table "dev"."dbt_bi_artifacts"."model_executions"
as (
    with dummy_cte as (
        select 1 as foo
    )
    select
        cast('12345678' as VARCHAR) as col1,-- creates varchar(8)
        cast(null as VARCHAR) as col2,      -- creates varchar(1)
        cast('12345678' as TEXT) as col3,   -- creates varchar(8)
        cast(null as TEXT) as col4,         -- creates varchar(1)
        cast(null as varchar(32)) as col5   -- creates varchar(32)
    from dummy_cte
    where 1 = 0
);
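
Along the "different logic based on adapter" line, one possible fix would be a Redshift-specific override of type_string that resolves to a wide varchar instead of TEXT. This is a sketch only, assuming the package's type_string() macro dispatches on the active adapter via adapter.dispatch:

{# With adapter.dispatch, this override would make cast(null as {{ type_string() }}) #}
{# render as cast(null as varchar(max)) on Redshift instead of cast(null as TEXT),   #}
{# avoiding the implicit varchar(1) columns shown above.                              #}
{% macro redshift__type_string() %}
    varchar(max)
{% endmacro %}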

@kylezeeuwen

kylezeeuwen commented Jan 31, 2024

Field report on @brendan-cook-87's fork. Works great for me!

Self-hosted dbt

dbt + Redshift version:

23:26:10  Running with dbt=1.7.4
23:26:11  Registered adapter: redshift=1.7.1

packages.yml:

packages:
  - package: dbt-labs/dbt_utils
    version: 1.1.1
  - package: dbt-labs/audit_helper
    version: 0.9.0
  - package: dbt-labs/codegen
    version: 0.12.1
  - package: calogica/dbt_expectations
    version: 0.10.1
  - git: "https://github.com/brendan-cook-87/dbt_artifacts.git"
    revision: "support-redshift"

Tested seed, run, and snapshot. No issues. Only tyre-kicked the data in the tables, but it looks correct.

Thanks for the code!!!

@FredericoCoelhoNunes

Any updates on this topic? Is there a particular reason why Redshift is not supported?

@hanslemm

hanslemm commented Oct 21, 2024

Created an updated version of @brendan-cook-87's work here.

It works as expected on Redshift, so if you need Redshift support, just use this fork until the PR is merged.
