Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-51114] [SQL] Refactor PullOutNondeterministic rule #49837

Closed

Conversation

mihailoale-db
Copy link
Contributor

@mihailoale-db mihailoale-db commented Feb 6, 2025

What changes were proposed in this pull request?

Refactor PullOutNondeterministic rule body so it can be reused in the single-pass analyzer.

Why are the changes needed?

Better reusability of the PullOutNondeterministic rule.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Existing tests (just refactoring).

Was this patch authored or co-authored using generative AI tooling?

No.

@github-actions github-actions bot added the SQL label Feb 6, 2025
@mihailoale-db mihailoale-db force-pushed the nondeterministicrefactor branch 2 times, most recently from cccf3a3 to e08b60b Compare February 6, 2025 20:47
@vladimirg-db
Copy link
Contributor

Please specify that this is for single-pass Analyzer in the PR description.

@mihailoale-db mihailoale-db force-pushed the nondeterministicrefactor branch from e08b60b to ddd9b52 Compare February 10, 2025 10:30
@mihailoale-db
Copy link
Contributor Author

Hi @MaxGekk could you please review when you have time? Thanks

nondeterministicToAttributes
}

def tryConvertNondeterministicToAttribute(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels too trivial to be reused. Please use getOrDefault in-place.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

getOrDefault can't be used here as default value is Expression type but getOrDefault expects a NamedExpression. I would leave it like it is

val newChild = Project(a.child.output ++ nondeterToAttr.values, a.child)
val nondeterToAttr =
NondeterministicExpressionCollection.getNondeterministicToAttributes(a.groupingExpressions)
val newChild = Project(a.child.output ++ nondeterToAttr.values.asScala.toSeq, a.child)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a separate issue (not related to this refactoring), but we have a non-deterministic Project list order here. We should do a separate PR with a transition to a LinkedHashMap.

Kinda similar to #49319

@mihailom-db @cloud-fan

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah makes sense. Will make a followup after this one is merged.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

@mihailoale-db mihailoale-db force-pushed the nondeterministicrefactor branch from ddd9b52 to 460f8e0 Compare February 10, 2025 16:37
@mihailoale-db mihailoale-db force-pushed the nondeterministicrefactor branch from 460f8e0 to 62c97dd Compare February 11, 2025 13:04
}
e -> ne
}
}.toMap
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shall we also keep the previous code of creating an immutable map?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@mihailoale-db mihailoale-db force-pushed the nondeterministicrefactor branch from 62c97dd to b200cf9 Compare February 11, 2025 13:30
@the-sakthi
Copy link
Member

LGTM

@cloud-fan
Copy link
Contributor

The AQE test failure is unrelated, thanks, merging to master!

@cloud-fan cloud-fan closed this in f52a679 Feb 12, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants