test: show stats in explain of two representative queries #8173
GROUP BY t1.c1
HAVING sum(t2.c4) > 1
ORDER BY t1.c1 ASC
LIMIT 10;
@alamb: At first I planned to add many small queries, each on a single data type. It turns out that would require quite a lot of queries, plus many combinations across data types, and they still would not cover the propagation I want to see, so I would need complicated queries in addition to the small ones.
To avoid that, I decided to go with two fairly representative queries, one on a CSV file and one on a Parquet file. Each has different combinations of filters on different data types and includes the common standard SQL clauses (SELECT, FROM, WHERE, GROUP BY, HAVING, ORDER BY, LIMIT). They show not only the statistics for each operator but also how those statistics are propagated upward.
Let me know what you think. I am happy to add small queries, too.
I think this makes sense and is a great first step.
There appears to be a CI failure on this PR.
Marking as draft to show this PR isn't waiting for review.
@alamb: It seems there have been improvements in statistics recently. Do you think we still need these kinds of tests?
I think the tests in this PR add value as they function as an end to end test of statistics calculation and propagation -- perhaps we can move them to a different
Thank you for your contribution. Unfortunately, this pull request is stale because it has been open 60 days with no activity. Please remove the stale label or comment or this will be closed in 7 days.
Which issue does this PR close?
Tests for #8155
Rationale for this change
I have found that statistics were lost while being propagated upward in the plan. These tests include representative operators so that we can verify whether the statistics are computed and propagated upward correctly.
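To illustrate what "propagated upward" means here, the following is a toy Python sketch, not DataFusion code. The `RowStats` class and the `scan`, `filter_`, `limit`, and `lossy` helpers are hypothetical names invented for this illustration, and the fixed filter selectivity is an assumption. Each operator derives its row-count estimate from its child's estimate, so once one operator drops the statistic, the operators above it have nothing to work with.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RowStats:
    """Toy row-count statistic (hypothetical, not a DataFusion type)."""
    rows: Optional[int]  # None means the statistic is absent
    exact: bool = False


def scan(rows: int) -> RowStats:
    # Assumption: a file scan knows its row count exactly.
    return RowStats(rows, exact=True)


def filter_(child: RowStats, selectivity: float) -> RowStats:
    # A filter can only estimate its output, so the result is inexact.
    if child.rows is None:
        return RowStats(None)
    return RowStats(int(child.rows * selectivity), exact=False)


def limit(child: RowStats, n: int) -> RowStats:
    # A limit caps the row count; with no input statistic, n is only an upper bound.
    if child.rows is None:
        return RowStats(n, exact=False)
    return RowStats(min(child.rows, n), exact=child.exact)


def lossy(child: RowStats) -> RowStats:
    # Models an operator that fails to pass statistics along.
    return RowStats(None)


# Statistics flow from the scan up through the filter and the limit.
healthy = limit(filter_(scan(1000), 0.5), 10)
# One lossy operator leaves the filter above it with no estimate at all.
broken = filter_(lossy(scan(1000)), 0.5)
print(healthy)
print(broken)
```

An end-to-end EXPLAIN test plays the same role as comparing `healthy` against `broken`: it makes a missing or degraded statistic visible at the operator where it first disappears.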
What changes are included in this PR?
Add two EXPLAIN queries, one on a CSV file and one on a Parquet file.
Are these changes tested?
These changes are tests only.
Are there any user-facing changes?
No