Skip to content

fix: support within_group #16538

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

chenkovsky
Copy link
Contributor

@chenkovsky chenkovsky commented Jun 24, 2025

Which issue does this PR close?

Rationale for this change

WITHIN GROUP clause gets ignored for non ordered aggregate function

What changes are included in this PR?

reject WITHIN GROUP clause. I checked Postgres, it also doesn't support such query. if anyone thinks that this query should be supported. I can modify thir PR to support it.

Are these changes tested?

UT

Are there any user-facing changes?

No

@github-actions github-actions bot added sql SQL Planner sqllogictest SQL Logic Tests (.slt) labels Jun 24, 2025
Comment on lines 7045 to 7046
SELECT array_agg(DISTINCT a_varchar) within group (order by a_varchar)
FROM (VALUES ('a'), ('d'), ('c'), ('a')) t(a_varchar);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no reason for this not to work. Equivalent syntax is already supported:

DataFusion CLI v48.0.0
> SELECT array_agg(DISTINCT a_varchar order by a_varchar)
FROM (VALUES ('a'), ('d'), ('c'), ('a')) t(a_varchar);
+-----------------------------------------------------------------------+
| array_agg(DISTINCT t.a_varchar) ORDER BY [t.a_varchar ASC NULLS LAST] |
+-----------------------------------------------------------------------+
| [a, c, d]                                                             |
+-----------------------------------------------------------------------+

... and not only in DataFusion

trino> SELECT array_agg(DISTINCT a_varchar order by a_varchar)
    -> FROM (VALUES ('a'), ('d'), ('c'), ('a')) t(a_varchar);;
   _col0
-----------
 [a, c, d]

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I updated this PR to support this feature.

@@ -375,6 +375,10 @@ impl<S: ContextProvider> SqlToRel<'_, S> {
return plan_err!("WITHIN GROUP clause is required when calling ordered set aggregate function({})", fm.name());
}

if !fm.is_ordered_set_aggregate() && !within_group.is_empty() {
return plan_err!("WITHIN GROUP clause is not permitted for non-ordered set aggregate function({})", fm.name());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

array_agg is definitely ordered, or order-sensitive

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think order-sensitive means order is necessary. but for array_agg, order is optional.

@findepi
Copy link
Member

findepi commented Jun 25, 2025

Closes #16515.

Can you please add both test queries from the issue description? Thanks!

@chenkovsky chenkovsky changed the title fix: reject within_group for non ordered aggregate function fix: support within_group Jun 26, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
sql SQL Planner sqllogictest SQL Logic Tests (.slt)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

PostgreSQL dialect's WITHIN GROUP clause gets ignored
2 participants