[C++][Parquet] allow customized buffer size when creating ArrowInputStream for a column PageReader #43006

Closed
asfimport opened this issue Jul 6, 2023 · 1 comment

Comments

@asfimport
Collaborator

When the buffered stream is enabled, all column chunks currently share the same buffer size regardless of their actual sizes. That size is stored in the shared [read properties](https://github.com/apache/arrow/blob/main/cpp/src/parquet/file_reader.cc#L213).  

Given a limited memory budget, one may want to customize the buffer size for different column chunks based on their actual sizes, i.e., smaller chunks would consume less of the memory budget for their buffers.
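
To make the memory argument concrete, here is a minimal, library-independent sketch. It assumes each buffered stream reserves its full configured buffer size and that one stream is open per column chunk. `BufferSizeForChunk` and `TotalBufferBytes` are hypothetical helpers for illustration only, not part of the Arrow or Parquet API.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical helper (not an Arrow/Parquet function): pick a buffer size for
// one column chunk, capped by both the shared default and the chunk's own size.
inline int64_t BufferSizeForChunk(int64_t chunk_bytes, int64_t default_buffer_bytes) {
  // Never reserve more than the chunk holds; keep at least 1 byte.
  return std::max<int64_t>(1, std::min(chunk_bytes, default_buffer_bytes));
}

// Hypothetical helper: total buffer memory when every chunk has its own stream.
// With per_chunk == false, every stream uses the shared default size.
inline int64_t TotalBufferBytes(const std::vector<int64_t>& chunk_sizes,
                                int64_t default_buffer_bytes, bool per_chunk) {
  int64_t total = 0;
  for (int64_t size : chunk_sizes) {
    total += per_chunk ? BufferSizeForChunk(size, default_buffer_bytes)
                       : default_buffer_bytes;
  }
  return total;
}
```

For example, with chunks of 512 B, 4 KiB, and 1 MiB and a shared buffer size of 16 KiB, the shared setting reserves 3 × 16384 = 49152 bytes, while per-chunk sizing under these assumptions reserves 512 + 4096 + 16384 = 20992 bytes.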

Reporter: Jinpeng Zhou / @jp0317

PRs and other links:

Note: This issue was originally created as PARQUET-2321. Please see the migration documentation for further details.

@asfimport
Collaborator Author

Jinpeng Zhou / @jp0317:
 I think we can close this one for now, as it may not be worth making all these changes for one particular scenario. I'll revisit this if it becomes more favorable. Thanks for all the comments and reviews.
