Hive vs Iceberg timestamps in unit tests #653

Open
2 tasks done
valerio-auricchio opened this issue May 20, 2024 · 6 comments
Labels
bug Something isn't working

Comments

@valerio-auricchio

Is this a new bug in dbt-athena?

  • I believe this is a new bug in dbt-athena
  • I have searched the existing issues, and I could not find an existing issue for this bug

Current Behavior

I’m spiking dbt-athena but I’ve bumped into another issue.
At the moment, I am building all models as Iceberg tables. As a result, all of the timestamp fields are set as timestamp(6). All good so far, and I can dbt run to build all of the models without a problem.
However, when I run the unit tests, I am getting timestamp errors:
Incorrect timestamp precision for timestamp(6); the configured precision is MILLISECONDS; column name:"column_name.

The issue is that when it creates the intermediate tables for testing, it uses "hive" tables. However, in our case, we need "iceberg" tables.

When we launch the tests, dbt issues the following SQL statement, which uses table_type='hive':

create table "datacatalog"."schema"."table_dbt_tmp"
    with (
      table_type='hive',
      is_external=true,
      external_location='s3://.....',
      format='parquet'
    )
....

Expected Behavior

We expect table_type='iceberg' instead of 'hive', like this:

create table "datacatalog"."schema"."table_dbt_tmp"
    with (
      table_type='iceberg',
      is_external=true,
      external_location='s3://.....',
      format='parquet'
    )
....

Steps To Reproduce

Here are the steps to reproduce our issue.
We are using Athena and materializing the tables as Iceberg. Run a unit test against a model that is materialized as Iceberg and has a column of type "timestamp(6)".
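
A minimal model for this setup might look like the sketch below. The file name, column names, and literal value are assumptions chosen for illustration; only the Iceberg materialization and the timestamp(6) column come from the description above.

```sql
-- models/events_iceberg.sql (hypothetical model name)
{{ config(
    materialized='table',
    table_type='iceberg',
    format='parquet'
) }}

select
    1 as id,
    -- the column that ends up as timestamp(6) in the Iceberg table
    cast(timestamp '2024-05-20 10:00:00.123456' as timestamp(6)) as created_at
```

Running a unit test against a model shaped like this is the situation in which the precision error above was reported.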

Environment

- dbt: 1.8.0
- dbt-athena-community: 1.8.1

Additional Context

No response

@valerio-auricchio valerio-auricchio added the bug Something isn't working label May 20, 2024
@Jrmyy
Contributor

Jrmyy commented May 20, 2024

👋🏻 Hello

Thanks for opening the issue. Following a discussion on Slack this morning (https://getdbt.slack.com/archives/C013MLFR7BQ/p1716193690918929?thread_ts=1715843843.181679&cid=C013MLFR7BQ), I think the issue is more specific than that. It happens only when storing test failures: when this option is enabled (store_failures set to true), dbt creates a table, but we don't get to configure the table type.

One idea would be to add an additional parameter, store_failures_table_config, as a dictionary, and to override the default dbt macro so that we can support it.
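
As a rough sketch of how the proposal could look from the user's side, a singular test might pass the dictionary through its config block. Note that store_failures_table_config is only the parameter proposed in this comment and is not an existing dbt-athena option; the test name, model name, and dictionary key are likewise assumptions.

```sql
-- tests/assert_no_future_timestamps.sql (hypothetical singular test)
-- store_failures is an existing dbt test config.
-- store_failures_table_config is HYPOTHETICAL: it is the parameter proposed
-- in this thread and is not implemented at the time of writing.
{{ config(
    store_failures=true,
    store_failures_table_config={'table_type': 'iceberg'}
) }}

select *
from {{ ref('events_iceberg') }}  -- hypothetical model name
where created_at > current_timestamp
```

The idea would be for the overridden test materialization to read this dictionary when it creates the failures table, instead of always falling back to table_type='hive'.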

@nicor88
Contributor

nicor88 commented May 21, 2024

@Jrmyy is there a macro called by dbt-core that we could override? In that case we can override that instead.

@Jrmyy
Contributor

Jrmyy commented May 21, 2024

@nicor88 there is no macro, unfortunately; it is defined directly in the test materialization: https://github.com/dbt-labs/dbt-adapters/blob/main/dbt/include/global_project/macros/materializations/tests/test.sql#L5

@nicor88
Contributor

nicor88 commented May 21, 2024

@Jrmyy what you suggested works for me; we can let the user pass a store_failures_table_config in order to persist what they need. I was also wondering whether we could override small pieces of the macro instead of the whole thing. But if we have no choice, we can override the whole test macro and eventually raise an issue in dbt-adapters/dbt-core.

@borjagonzal

Hi, how could we help make progress on this issue? It would be great to be able to use unit tests directly for our models producing Iceberg tables.

@nicor88
Contributor

nicor88 commented Aug 27, 2024

@colin-rogers-dbt do you have any hints regarding this issue? What would be the best way to configure the table properties of the table that is used when storing failures? What @Jrmyy proposed here should work, but I would like to understand whether there is an easier way.
