sum(case when which_diff ='a_minus_b' then 1 else 0 end) -
sum(case when which_diff ='b_minus_a' then 1 else 0 end)
), 0)
This has the effect of returning unusual numbers in the console when the a_minus_b and b_minus_a row counts differ.
Example:
Failure table contains 1,000 rows total
Failure table contains 250 rows of a_minus_b
Failure table contains 750 rows of b_minus_a
My expectation is that the console would report [FAIL 1000], because the failure table has 1,000 rows.
However, the actual output is [FAIL 1500], because:
count(*) = 1,000
abs(250 - 750) = 500
total = 1,000 + 500 = 1,500
As a dbt user, if I see 1,500 failures, I expect to find 1,500 failing rows, and it's very confusing when I investigate and find 1,000 failing rows instead.
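To make the arithmetic concrete, here is a minimal sketch that reproduces the numbers above with Python's built-in sqlite3 module. The `failures` table is a hypothetical stand-in for the failure table, and the opening `count(*) + coalesce(abs(` of the expression is my assumption, inferred from the 1,000 + 500 breakdown, since only the tail of the macro's calculation is quoted in this issue.

```python
import sqlite3

# Hypothetical stand-in for the failure table: one row per mismatched record,
# labelled with the side of the comparison it came from.
conn = sqlite3.connect(":memory:")
conn.execute("create table failures (which_diff text)")
conn.executemany(
    "insert into failures values (?)",
    [("a_minus_b",)] * 250 + [("b_minus_a",)] * 750,
)

# Assumed shape of the current calculation: the row count plus the absolute
# difference between the two per-side counts.
current_calc = """
    select count(*) + coalesce(abs(
        sum(case when which_diff = 'a_minus_b' then 1 else 0 end) -
        sum(case when which_diff = 'b_minus_a' then 1 else 0 end)
    ), 0)
    from failures
"""

# Proposed calculation: report exactly the number of failing rows.
proposed_calc = "select count(*) from failures"

reported_now = conn.execute(current_calc).fetchone()[0]
reported_proposed = conn.execute(proposed_calc).fetchone()[0]
print(reported_now, reported_proposed)
```

With 250 and 750 rows on the two sides, the assumed current expression reports 1,500 while a plain count(*) reports 1,000, which matches the number of rows a user would actually find in the failure table.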
This calculation seems to have been part of the equality test since it was first committed in 2017 by @jthandy. Would anyone know why the calculation is set up in this way, and if we would be fine with just a count(*) instead? Happy to make the PR for it after some discussion! If there is a good reason for keeping the calculation as-is, I'm happy to write additional comments in the macro file for other users who may be confused too 😄
Are you interested in contributing the fix?
Yes!
This issue has been marked as Stale because it has been open for 180 days with no activity. If you would like the issue to remain open, please comment on the issue or else it will be closed in 7 days.
Although we are closing this issue as stale, it's not gone forever. Issues can be reopened if there is renewed community interest. Just add a comment to notify the maintainers.
Describe the bug
This is to discuss the logic behind the existing calculation in the equality test and propose a more straightforward calculation.
Currently the calculation is:
dbt-utils/macros/generic_tests/equality.sql
Lines 8 to 11 in 6ba7b66