Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

test: show stats in explain of two representative queries #8173

Closed
wants to merge 1 commit into from

Conversation

NGA-TRAN
Copy link
Contributor

Which issue does this PR close?

Tests for #8155

Rationale for this change

I have found that statistics were lost being propagated upward in the plan. These are tests that include representative operators for us to verify whether the statistics are computed and propagated up correctly

What changes are included in this PR?

Add 2 explains

Are these changes tested?

They are tests only

Are there any user-facing changes?

No

@github-actions github-actions bot added the sqllogictest SQL Logic Tests (.slt) label Nov 14, 2023
GROUP BY t1.c1
HAVING sum(t2.c4) > 1
ORDER BY t1.c1 ASC
LIMIT 10;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alamb : At first I thought I would add many small queries and on single data type but it turns out, if I do so, I have to add quite many and also many combinations for different data types. Still, they do not cover the propagation I want to see. Thus, I have to add complicated queries plus those small queries.

To avoid that, I decided to go with 2 quite representative queries, one on CVS file and one on parquet file. Each I have different combinations of filters on different data types and includes common standard SQL clauses (select, from, where, group by, having, order by, limit). They not only show the statistics for each operator but also how they are propagated upward.

Let me know what you think. I am happy to add small queries, too

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this makes sense and is a great first step.

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There appears to be a CI failure on this PR

GROUP BY t1.c1
HAVING sum(t2.c4) > 1
ORDER BY t1.c1 ASC
LIMIT 10;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this makes sense and is a great first step.

@alamb alamb marked this pull request as draft December 17, 2023 12:15
@alamb
Copy link
Contributor

alamb commented Dec 17, 2023

marking as draft to show this PR isn't waiting for review

@NGA-TRAN
Copy link
Contributor Author

@alamb : It seems there are improvements in statistics recently. Do you think we still need this kind of tests?

@alamb
Copy link
Contributor

alamb commented Mar 22, 2024

I think the tests in this PR add value as they function as an end to end test of statistics calculation and propagation -- perhaps we can move them to a different .slt file (statistics.slt perhaps)?

Copy link

Thank you for your contribution. Unfortunately, this pull request is stale because it has been open 60 days with no activity. Please remove the stale label or comment or this will be closed in 7 days.

@github-actions github-actions bot added the Stale PR has not had any activity for some time label May 22, 2024
@github-actions github-actions bot closed this May 29, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
sqllogictest SQL Logic Tests (.slt) Stale PR has not had any activity for some time
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants