chenkovsky
Contributor

Which issue does this PR close?

Rationale for this change

Support the Spark shuffle UDF.

What changes are included in this PR?

Adds support for the Spark shuffle UDF.

Are these changes tested?

Yes, with unit tests.

Are there any user-facing changes?

No

@github-actions github-actions bot added sqllogictest SQL Logic Tests (.slt) spark labels Sep 19, 2025
impl SparkShuffle {
    pub fn new() -> Self {
        Self {
            signature: Signature::any(1, Volatility::Volatile),
Contributor

Suggested change
-            signature: Signature::any(1, Volatility::Volatile),
+            signature: Signature::arrays(1, None, Volatility::Volatile),

Example:

impl ArrayEmpty {
    pub fn new() -> Self {
        Self {
            signature: Signature::arrays(1, None, Volatility::Immutable),
            aliases: vec!["array_empty".to_string(), "list_empty".to_string()],
        }
    }
}

(using arrays() instead of array() to avoid the coercion from FixedSizeList to List)

That said, the Spark doc says shuffle accepts an optional seed argument; do we need to support that here?

Contributor

I think that if we implement the seed argument, we can write deterministic tests for shuffle, without having to run the output through sort or rely on the shuffled permutation differing from the sorted version.
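
To illustrate why a seed makes the tests deterministic, here is a minimal std-only sketch. The names `SplitMix64` and `shuffle_with_seed` are hypothetical and exist only for this example; this is not the code in this PR, and it makes no attempt to reproduce the permutations Spark itself would generate for a given seed.

```rust
/// Small SplitMix64 generator, used here only so the sketch needs no
/// external crates. Hypothetical helper, not part of DataFusion.
struct SplitMix64 {
    state: u64,
}

impl SplitMix64 {
    fn new(seed: u64) -> Self {
        Self { state: seed }
    }

    fn next_u64(&mut self) -> u64 {
        self.state = self.state.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.state;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }
}

/// Fisher-Yates shuffle driven by an explicit seed: the same seed and
/// input always produce the same permutation.
fn shuffle_with_seed<T>(values: &mut [T], seed: u64) {
    let mut rng = SplitMix64::new(seed);
    for i in (1..values.len()).rev() {
        // Pick j in 0..=i. Modulo bias is ignored in this sketch.
        let j = (rng.next_u64() % (i as u64 + 1)) as usize;
        values.swap(i, j);
    }
}
```

With something along these lines, a test can shuffle the same input twice with the same seed and compare the two results directly, instead of sorting the output first.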

}
}

fn general_array_shuffle<O: OffsetSizeTrait + TryFrom<i64>>(
Contributor

Suggested change
- fn general_array_shuffle<O: OffsetSizeTrait + TryFrom<i64>>(
+ fn general_array_shuffle<O: OffsetSizeTrait>(
