Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bug in macro incremental_delete when using composite keys for unique_key #143

Closed
lpezet opened this issue Apr 2, 2023 · 5 comments · Fixed by #144
Closed

Bug in macro incremental_delete when using composite keys for unique_key #143

lpezet opened this issue Apr 2, 2023 · 5 comments · Fixed by #144
Labels
bug Something isn't working

Comments

@lpezet
Copy link
Contributor

lpezet commented Apr 2, 2023

Describe the bug

When using array of columns for unique_key in incremental materialization, the macro incremental_delete chokes and the following error is thrown, using MySQL 8.
I'm using code from main branch.

Example of model:

{{ config(materialized="incremental", unique_key=["actor_id", "film_id"]) }}
select * from {{ source('raw', 'seed') }}

Example of seed:

actor_id,film_id,some_date
1,1,1981-05-20T06:46:51
2,1,1978-09-03T18:10:33
3,1,1982-03-11T03:59:51

Exception thrown when running dbt run:

You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '['actor_id', 'film_id']) in (
ERROR    configured_std_out:functions.py:236 20:04:01    1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '['actor_id', 'film_id']) in (

Steps To Reproduce

git clone https://github.com/lpezet/dbt-mysql.git
git checkout fix-incremental-composite-keys
# get setup for integration tests
docker-compose up -d
PYTHONPATH=. pytest -v --profile mysql tests/functional/adapter/test_basic.py::TestIncrementalCompositeKeysMySQL::test_incremental_with_composite_keys
docker-compose down

Expected behavior

Exception not thrown and rows from seed deleted and then re-inserted without creating duplicates.

The operating system you're using:
Linux GoldenHind 5.15.90.1-microsoft-standard-WSL2 #1 SMP Fri Jan 27 02:56:13 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux

The output of python --version:
Python 3.8.10

MySQL Server 8.0.29

Additional context

The branch fix-incremental-composite-keys actually contains the test to ,well, test incremental materialization with composite keys.
I couldn't find the equivalent in dbt-tests-adapter code...strangely...I must have missed something. I therefore created the class tests/functional/adapter/incremental_composite.py based on dbt-tests-adapter's very own BaseIncremental class and added a test in tests/functional/adapter/test_basic.py called TestIncrementalCompositeKeysMySQL.

The problem, I believe, is in 2 parts:

  • The use of parenthesis () in the select ( {{ unique_key }} ) to get values for the in ().
  • The use of {{ unique_key }} directly as it will resolve in "[ key1, key2, ... ]" (an array representation) when an array is specified in the model like " {{ config( ..., unique_key=[ "key1", "key2", ...] }}". This seems possible since dbt-core 1.1.

I believe the following would fix the issue:

{% macro incremental_delete(tmp_relation, target_relation, unique_key=none, statement_name="pre_main") %}
    {%- if unique_key is not none -%}
    delete
    from {{ target_relation }}
    where ({{ unique_key | join(',') }}) in (
        select {{ unique_key | join(',') }}
        from {{ tmp_relation }}
    )
    {%- endif %}

{%- endmacro %}

NB: No *GPT has been harmed, or even used!, during the creation of all this. Just lots of time spent figuring out dbt-tests-adapter and dbt-mysql, git-ing, coding, and typing this very long ticket. Just like old (?) times ;)

@lpezet lpezet added the bug Something isn't working label Apr 2, 2023
@moszutij
Copy link

Perhaps the following implementation for backwards compatibility:

{% macro incremental_delete(tmp_relation, target_relation, unique_key=none, statement_name="pre_main") %}
    {%- if unique_key is not none -%}
    delete
    from {{ target_relation }}
    where ({{ unique_key if unique_key is string else unique_key | join(',') }}) in (
        select {{ unique_key if unique_key is string else unique_key | join(',') }}
        from {{ tmp_relation }}
    )
    {%- endif %}

{%- endmacro %}

@lpezet
Copy link
Contributor Author

lpezet commented Apr 23, 2023

@moszutij Your solution is definitely better.
I ran my "solution" (let's call it a quick hack) before and all tests pass. That means even the standard dbt-tests-adapter tests pass. I'll add some tests to make sure my hack would fail those.
It is surprising those cases are not tested by dbt-tests-adapter (I'm referring to this: https://github.com/dbt-labs/dbt-tests-adapter/blob/main/dbt/tests/adapter/basic/test_incremental.py). Maybe I missed something or I'm looking at the wrong thing?

@moszutij
Copy link

After examining your solution, I decided to give it a try, but later realised that it only worked for a list of column names enclosed in single quotes, and not for a string representation. However, the documentation suggests that both formats are acceptable, hence the suggested implementation. I haven't had the chance to familiarise myself with the standard dbt-tests-adapter tests yet, but it's on my to-do list 👩‍💻. I'm also using dbt with mysql, so appreciate the pull requests.

@lpezet
Copy link
Contributor Author

lpezet commented Apr 23, 2023

@moszutij Thanks for checking it out. I really appreciate.
I actually just found those tests in dbt-core itself:
https://github.com/dbt-labs/dbt-core/blob/1.2.latest/tests/adapter/dbt/tests/adapter/incremental/test_incremental_unique_id.py
Way more thorough than what I came up with.
I'll update my branch for PR to leverage that test.

There are even more tests in 1.4.latest (shameless plug for my PR #146) when we get there.

@dbeatty10
Copy link
Owner

Thanks for your time and effort into this @lpezet and @moszutij ! 🏆

I'm out of pocket this week, but looking forward to collaborating with you next week to get these merged and released.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants