opt: add rule to decorrelate union operators #131141

Open
wants to merge 3 commits into
base: master
DrewKimball
Collaborator

opt: improve EnsureKey custom function

This commit makes a small improvement to the EnsureKey custom function
(used in decorrelation rules), so that it can add passthrough columns to
a Project operator in an effort to find a key. This can prevent rules
from adding unnecessary Ordinality operators to the query plan.
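
As a rough illustration of the idea (not the optimizer's actual code), the toy model below shows why passing key columns through a projection can remove the need for a synthetic row number. The helper names `project`, `has_key`, and `project_with_key` are hypothetical, and `has_key` checks uniqueness against sample data, whereas a real optimizer would reason from schema metadata such as functional dependencies.

```python
def project(rows, cols):
    """Keep only the named columns of each row (a toy Project operator)."""
    return [{c: row[c] for c in cols} for row in rows]


def has_key(rows, cols):
    """Return True if the given columns uniquely identify every row.

    Assumption: this is a data-based check for illustration only.
    """
    seen = {tuple(row[c] for c in cols) for row in rows}
    return len(seen) == len(rows)


def project_with_key(rows, cols, input_key):
    """Project `cols`, passing through any input key columns that are missing.

    Returns the projected rows and the key columns now available on them.
    """
    passthrough = [c for c in input_key if c not in cols]
    return project(rows, list(cols) + passthrough), list(input_key)


rows = [{"id": 1, "name": "a"}, {"id": 2, "name": "a"}]

# Projecting only "name" loses the key: both output rows are identical.
plain = project(rows, ["name"])

# Passing "id" through keeps the rows distinguishable without numbering them.
keyed, key_cols = project_with_key(rows, ["name"], ["id"])
```

In this model the alternative would be to append a row counter to `plain`, which is the role the `Ordinality` operator plays in the commit message above.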

Epic: None

Release note: None

opt: improve exists subquery hoisting

This commit makes a small improvement to the subquery-hoisting rules so
that hoisting an EXISTS subquery can often avoid projecting a new
column to check for NULL values. This can allow other optimization rules
to match later on.

Epic: None

Release note: None

opt: add rule to decorrelate unions in EXISTS subqueries

This commit adds a new rule TryDecorrelateUnion, which matches on a
Union or UnionAll operator in the input of a ScalarGroupBy. The
ScalarGroupBy must have "any-not-null" semantics, meaning it produces
an arbitrary non-null value from each input column.

If these conditions are satisfied, the Union operator is replaced by an
InnerJoin between two ScalarGroupBy operators. A Project coalesces
columns from each side of the join to produce the final aggregated values.

This transformation does not itself decorrelate the Union operators, but
it does make it easier for other rules to do so.
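
The equivalence behind this rewrite can be sketched with a small model. This is an illustration under stated assumptions, not the rule's implementation: rows are tuples, `None` stands for SQL NULL, and "any-not-null" is modeled as "first non-NULL value per column" (the real aggregate may return any non-NULL value). The function names `any_not_null`, `before_rewrite`, and `after_rewrite` are hypothetical.

```python
def any_not_null(rows, ncols):
    """Scalar aggregation: always one output row, with the first
    non-NULL value seen in each column (None if there is none)."""
    out = [None] * ncols
    for row in rows:
        for i, value in enumerate(row):
            if out[i] is None and value is not None:
                out[i] = value
    return tuple(out)


def union_all(left, right):
    return list(left) + list(right)


def union(left, right):
    """Union with duplicate removal, preserving first-seen order."""
    seen, out = set(), []
    for row in union_all(left, right):
        if row not in seen:
            seen.add(row)
            out.append(row)
    return out


def before_rewrite(left, right, ncols, distinct=False):
    """ScalarGroupBy over a Union (distinct=True) or UnionAll."""
    rows = union(left, right) if distinct else union_all(left, right)
    return any_not_null(rows, ncols)


def after_rewrite(left, right, ncols):
    """Join of two ScalarGroupBy results, then a coalescing Project."""
    l_row = any_not_null(left, ncols)
    r_row = any_not_null(right, ncols)
    # Each side yields exactly one row, so an unconditional inner join
    # yields exactly one combined row.
    return tuple(l if l is not None else r for l, r in zip(l_row, r_row))


left = [(None, 1), (2, None)]
right = [(3, None)]
result_before = before_rewrite(left, right, 2)
result_after = after_rewrite(left, right, 2)
```

Because each side is aggregated on its own after the rewrite, a correlated reference in one branch of the union no longer sits under a shared set operator, which is consistent with the commit's note that the rule makes later decorrelation easier rather than performing it.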

Release note: None

Epic: None

@DrewKimball DrewKimball requested review from mgartner and a team September 21, 2024 08:04
@DrewKimball DrewKimball requested a review from a team as a code owner September 21, 2024 08:04
blathers-crl bot commented Sep 21, 2024

Your pull request contains more than 1000 changes. It is strongly encouraged to split big PRs into smaller chunks.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

@cockroach-teamcity
Member

This change is Reviewable


2 participants