[C++] Add arrow::ArrayStatistics #41909
Comments
I think this is a great idea. Adding this can make Arrow easier to use, especially for people working with data sources like Parquet that have statistics.
The mailing list discussion: https://lists.apache.org/thread/kcpyq9npnh346pw90ljwbg0wxq6hwxxh
See GH-42133 for how to use this for Apache Parquet statistics.
### Rationale for this change

We're discussing the API on the mailing list https://lists.apache.org/thread/kcpyq9npnh346pw90ljwbg0wxq6hwxxh and in GH-41909.

If we have `arrow::ArrayStatistics`, we can attach statistics read from Apache Parquet to `arrow::Array`s.

This only includes `arrow::ArrayStatistics`. See GH-42133 for how to use `arrow::ArrayStatistics` for Apache Parquet's statistics.

### What changes are included in this PR?

This only adds `arrow::ArrayStatistics` and its tests.

### Are these changes tested?

Yes.

### Are there any user-facing changes?

Yes.

* GitHub Issue: #41909

Authored-by: Sutou Kouhei <[email protected]>
Signed-off-by: Sutou Kouhei <[email protected]>
Issue resolved by pull request 43273
Describe the enhancement requested

An Arrow array doesn't have statistics, but an Arrow array source such as a Parquet column may have statistics.

We can get the source statistics via a source reader such as `parquet::ColumnChunkMetaData::statistics()` (`parquet::ParquetFileReader::metadata()->RowGroup(X)->ColumnChunk(Y)->statistics()`), but we can't get them from the Arrow array that was read (e.g. by `parquet::arrow::FileReader::ReadColumn()`).

How about adding `arrow::ArrayStatistics` or something similar and attaching source statistics to `arrow::Array`?

Component(s)

C++
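
To make the request concrete, here is a minimal, self-contained sketch of the idea: a statistics record that a reader attaches to the array it returns, so a consumer can use it without going back to the source metadata. Everything in the `sketch` namespace (`ArrayStatistics`, `Int64Array`, `ReadColumnWithStatistics`, `MayContainValueAtLeast`) is a hypothetical stand-in written for this illustration. It is not the Arrow or Parquet API, and the fields are only an assumption about what such a record might carry.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <utility>
#include <vector>

namespace sketch {

// Hypothetical statistics record. Every field is optional because a source
// (for example a Parquet column chunk) may not provide all of them.
struct ArrayStatistics {
  std::optional<int64_t> null_count;
  std::optional<int64_t> distinct_count;
  std::optional<int64_t> min;
  std::optional<int64_t> max;
};

// Hypothetical stand-in for an array of nullable int64 values.
class Int64Array {
 public:
  explicit Int64Array(std::vector<std::optional<int64_t>> values)
      : values_(std::move(values)) {}

  // Attach statistics that came from the data source.
  void SetStatistics(ArrayStatistics statistics) {
    statistics_ = std::move(statistics);
  }

  // Empty when the source had no statistics.
  const std::optional<ArrayStatistics>& statistics() const {
    return statistics_;
  }

  const std::vector<std::optional<int64_t>>& values() const { return values_; }

 private:
  std::vector<std::optional<int64_t>> values_;
  std::optional<ArrayStatistics> statistics_;
};

// Hypothetical reader step: build the array and carry the source statistics
// along with it instead of dropping them.
inline Int64Array ReadColumnWithStatistics(
    std::vector<std::optional<int64_t>> values,
    const ArrayStatistics& source_statistics) {
  // A null count, when present, cannot be negative.
  assert(!source_statistics.null_count || *source_statistics.null_count >= 0);
  Int64Array array(std::move(values));
  array.SetStatistics(source_statistics);
  return array;
}

// Example consumer: use the attached maximum to decide whether the array can
// contain a value >= threshold. Without statistics the answer is "maybe".
inline bool MayContainValueAtLeast(const Int64Array& array, int64_t threshold) {
  const auto& statistics = array.statistics();
  if (statistics && statistics->max) {
    return *statistics->max >= threshold;
  }
  return true;  // Unknown: the caller has to scan the values.
}

}  // namespace sketch
```

With this shape, a consumer that receives an array from the reader can skip work based on `statistics()->max` alone, and an array built without source statistics simply reports none.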