test: Add a test for RowFilter with nested type #5600

Merged (1 commit) on Apr 9, 2024

Conversation

viirya
Member

@viirya viirya commented Apr 7, 2024

Which issue does this PR close?

Closes #.

Rationale for this change

While working on apache/iceberg-rust#295, I was confused by how RowFilter behaves with nested columns, so I wrote this test to clarify its usage.

What changes are included in this PR?

Are there any user-facing changes?

Comment on lines +1908 to +1914
// Filter on the second element of the struct.
let struct_array = batch
    .column(0)
    .as_any()
    .downcast_ref::<StructArray>()
    .unwrap();
eq(struct_array.column(0), &Scalar::new(&b_scalar))
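
The hunk above reads `struct_array.column(0)` even though its comment says it filters on the second element of the struct. One plausible reading is that the batch handed to the predicate contains only the projected leaves, so a struct whose first child was not selected ends up with a single child at position 0. The sketch below models that pruning with plain Rust types. The `Field` enum, the `project` function, and the `a` / `b { aa, bb }` / `c` schema are all hypothetical illustrations, not the parquet crate's API or the schema used in the PR.

```rust
// Toy model of a nested schema; illustrative only, NOT the parquet/arrow API.
#[derive(Debug, Clone, PartialEq)]
enum Field {
    Leaf(String),
    Struct(String, Vec<Field>),
}

/// Keep only the leaves whose depth-first index appears in `keep`,
/// dropping any struct that ends up with no children.
/// `next` is the running leaf counter shared across the recursion.
fn project(fields: &[Field], keep: &[usize], next: &mut usize) -> Vec<Field> {
    let mut out = Vec::new();
    for f in fields {
        match f {
            Field::Leaf(name) => {
                if keep.contains(&*next) {
                    out.push(Field::Leaf(name.clone()));
                }
                *next += 1;
            }
            Field::Struct(name, children) => {
                let kept = project(children, keep, next);
                if !kept.is_empty() {
                    out.push(Field::Struct(name.clone(), kept));
                }
            }
        }
    }
    out
}

/// Hypothetical schema: leaf `a`, struct `b { aa, bb }`, leaf `c`.
/// Depth-first leaf indices: a = 0, b.aa = 1, b.bb = 2, c = 3.
fn example_schema() -> Vec<Field> {
    vec![
        Field::Leaf("a".to_string()),
        Field::Struct(
            "b".to_string(),
            vec![Field::Leaf("aa".to_string()), Field::Leaf("bb".to_string())],
        ),
        Field::Leaf("c".to_string()),
    ]
}
```

In this model, projecting leaf index 2 leaves a single top-level column `b` with a single child `bb`, which is why both lookups in a predicate written against it would use index 0.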
Member Author

By the way, the row filter needs to know the schema so that it can pick the correct (nested) column to filter on. For a general filter implementation like the one apache/iceberg-rust#295 proposes, is there any utility we can use to "flatten" the nested columns of the batch?

In other words, is there an existing way to flatten the projected (nested) columns in the batch? Then, if we know a leaf column's index, we also know its position in the projection mask and in the flattened batch, and we can simply get the column with flatten_batch.column.
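
As a rough illustration of the idea being asked about, the sketch below flattens a toy batch into its leaves in depth-first order and looks up where a given leaf index lands among the projected leaves. `Column`, `flatten`, and `position_in_flattened` are hypothetical names invented for this sketch. It does not show an existing arrow or parquet utility, and it deliberately ignores struct-level nulls and repeated fields.

```rust
// Toy stand-in for a record batch column; NOT the arrow `ArrayRef` type.
#[derive(Debug, Clone, PartialEq)]
enum Column {
    Leaf(Vec<i32>),
    Struct(Vec<Column>),
}

/// Collect leaf columns in depth-first order (hypothetical helper).
fn flatten(columns: &[Column]) -> Vec<&Vec<i32>> {
    let mut leaves = Vec::new();
    for col in columns {
        match col {
            Column::Leaf(values) => leaves.push(values),
            Column::Struct(children) => leaves.extend(flatten(children)),
        }
    }
    leaves
}

/// Position of schema leaf `leaf` among the projected leaves.
/// Assumes `mask` lists the projected leaf indices in ascending order,
/// so the position in `mask` equals the position in the flattened batch.
fn position_in_flattened(mask: &[usize], leaf: usize) -> Option<usize> {
    mask.iter().position(|&l| l == leaf)
}
```

With a mask of `[0, 2]`, leaf 2 would be found at position 1 of the flattened batch, while leaf 1 is not projected and yields `None`.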

Contributor

I think the challenge here is when the nested arrays are either repeated or nullable. In such cases, trying to interpret the leaves in isolation isn't necessarily meaningful.
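
A small sketch of the nullable case, using a toy type rather than the arrow `StructArray`: a struct column carries its own validity, and the child may still hold a physical value in a row where the parent struct is null. Reading the child alone then reports a value that does not logically exist. `NullableStruct` and both functions are hypothetical and exist only for illustration.

```rust
/// Toy nullable struct column with a single child; not the arrow `StructArray`.
struct NullableStruct {
    /// `false` means the whole struct is null at that row.
    validity: Vec<bool>,
    /// Child values may still be physically present for null parent rows.
    child: Vec<Option<i32>>,
}

/// Read the child on its own, ignoring the parent's validity.
fn leaf_in_isolation(s: &NullableStruct) -> Vec<Option<i32>> {
    s.child.clone()
}

/// Read the child through the parent: a null parent masks the child value.
fn leaf_with_parent_nulls(s: &NullableStruct) -> Vec<Option<i32>> {
    s.validity
        .iter()
        .zip(&s.child)
        .map(|(&valid, &value)| if valid { value } else { None })
        .collect()
}
```

For a column with validity `[true, false, true]` and child `[Some(1), Some(2), None]`, the two readings disagree at the second row: the isolated leaf shows `Some(2)`, while the parent-aware view shows `None`. A predicate evaluated on a flattened leaf could therefore match rows whose struct is actually null.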

@tustvold tustvold merged commit f38283b into apache:master Apr 9, 2024
17 checks passed
Labels
parquet Changes to the parquet crate