
Equality test calculation doesn't make sense #836

Closed
foundinblank opened this issue Sep 14, 2023 · 4 comments
Labels
bug, Stale, triage

Comments

@foundinblank
Contributor

Describe the bug

This is to discuss the logic behind the existing calculation in the equality test and propose a more straightforward calculation.

Currently the calculation is:

count(*) + coalesce(abs(
sum(case when which_diff = 'a_minus_b' then 1 else 0 end) -
sum(case when which_diff = 'b_minus_a' then 1 else 0 end)
), 0)

This has the effect of reporting unusual numbers in the console whenever the a_minus_b and b_minus_a row counts differ.

Example:

  1. Failure table contains 1,000 rows total
  2. Failure table contains 250 rows of a_minus_b
  3. Failure table contains 750 rows of b_minus_a
  4. My expectation is the output in the console would report [FAIL 1000] because the failure table has 1,000 rows
  5. However, the actual output is [FAIL 1500] because
    1. count(*) = 1,000
    2. abs(250 - 750) = 500
    3. total = 1,000 + 500 = 1,500
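The arithmetic above can be reproduced end to end. Below is a minimal sketch using an in-memory SQLite table as a stand-in for the failure table (the table name `failures` and the setup are hypothetical, not the actual dbt-generated relation):

```python
import sqlite3

# Hypothetical stand-in for the equality test's failure table.
conn = sqlite3.connect(":memory:")
conn.execute("create table failures (which_diff text)")
conn.executemany(
    "insert into failures values (?)",
    [("a_minus_b",)] * 250 + [("b_minus_a",)] * 750,
)

# The calculation currently used by the equality test.
(reported,) = conn.execute("""
    select count(*) + coalesce(abs(
        sum(case when which_diff = 'a_minus_b' then 1 else 0 end) -
        sum(case when which_diff = 'b_minus_a' then 1 else 0 end)
    ), 0)
    from failures
""").fetchone()

# The proposed simpler calculation: just count the failing rows.
(actual_rows,) = conn.execute("select count(*) from failures").fetchone()

print(reported, actual_rows)  # 1500 1000
```

With 250 a_minus_b rows and 750 b_minus_a rows, the existing expression reports 1,500 while the table contains only 1,000 rows, which is the mismatch described above.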

As a dbt user, if I see 1,500 failures, I expect to find 1,500 failing rows, and it's very confusing when I investigate and find 1,000 failing rows instead.

This calculation seems to have been part of the equality test since it was first committed in 2017 by @jthandy. Would anyone know why the calculation is set up this way, and whether we would be fine with just a count(*) instead? Happy to make the PR for it after some discussion! If there is a good reason for keeping the calculation as-is, I'm happy to write additional comments in the macro file for other users who may be confused too 😄

Are you interested in contributing the fix?

Yes!

@foundinblank foundinblank added the bug and triage labels Sep 14, 2023

This issue has been marked as Stale because it has been open for 180 days with no activity. If you would like the issue to remain open, please comment on the issue or else it will be closed in 7 days.

@github-actions github-actions bot added the Stale label Mar 13, 2024
@foundinblank
Contributor Author

Let's keep this open

@github-actions github-actions bot removed the Stale label Mar 14, 2024

This issue has been marked as Stale because it has been open for 180 days with no activity. If you would like the issue to remain open, please comment on the issue or else it will be closed in 7 days.

@github-actions github-actions bot added the Stale label Sep 10, 2024

Although we are closing this issue as stale, it's not gone forever. Issues can be reopened if there is renewed community interest. Just add a comment to notify the maintainers.

@github-actions github-actions bot closed this as not planned Sep 18, 2024