How to disable statistics in version 1.13.1? #3103

Open
felipepessoto opened this issue Dec 17, 2024 · 3 comments

Comments

@felipepessoto

Describe the usage question you have. Please include as many useful details as possible.

I found these two PRs that make it possible to disable statistics, but they are available only in 1.15+:
#2989
#3056

Is there any other way to disable statistics in 1.13.1?

Component(s)

Core

@wgtmac
Member

wgtmac commented Dec 18, 2024

Thanks for reporting this! You're right, those commits have not been backported to the legacy branches. Could you use version 1.15.0?

@felipepessoto
Author

I can't because I'm using Spark 3.5.

I was wondering whether it is possible to somehow disable stats in 1.13.1, even if it is more complex than just setting a flag.

@wgtmac
Member

wgtmac commented Dec 18, 2024

I agree: in earlier versions there does not seem to be a way to disable stats, which is why I made those PRs. Perhaps you could set parquet.statistics.truncate.length and parquet.columnindex.truncate.length to 0 to disable stats for BYTE_ARRAY columns. Both options are documented at https://github.com/apache/parquet-java/tree/master/parquet-hadoop
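
For a Spark job, one way to try this suggestion is to pass the two properties through Spark's spark.hadoop.* prefix, which Spark copies into the Hadoop configuration. The sketch below only builds the corresponding --conf arguments. The class and method names (ParquetStatsConf, sparkConf, toSubmitArgs) are hypothetical, and routing the options through spark.hadoop.* is an assumption about the setup, not something stated in this thread. Whether 1.13.1 accepts a value of 0 for these options is also not confirmed here, so treat it as something to test.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper: renders the two options from the comment above
// as spark-submit --conf arguments. It does not touch Parquet itself.
final class ParquetStatsConf {
    // Property names quoted in the comment above (parquet-hadoop options).
    static final String STATS_TRUNCATE = "parquet.statistics.truncate.length";
    static final String COLUMN_INDEX_TRUNCATE = "parquet.columnindex.truncate.length";

    // Assumption: the job runs on Spark, which forwards "spark.hadoop.*"
    // entries to the Hadoop Configuration with the prefix stripped.
    static Map<String, String> sparkConf(int truncateLength) {
        Map<String, String> conf = new LinkedHashMap<>();
        String value = Integer.toString(truncateLength);
        conf.put("spark.hadoop." + STATS_TRUNCATE, value);
        conf.put("spark.hadoop." + COLUMN_INDEX_TRUNCATE, value);
        return conf;
    }

    // Joins the entries into a single string of --conf key=value pairs.
    static String toSubmitArgs(Map<String, String> conf) {
        StringJoiner joiner = new StringJoiner(" ");
        conf.forEach((key, value) -> joiner.add("--conf " + key + "=" + value));
        return joiner.toString();
    }

    public static void main(String[] args) {
        // 0 is the value suggested above; it is untested against 1.13.1.
        System.out.println(toSubmitArgs(sparkConf(0)));
    }
}
```

The printed arguments can be appended to a spark-submit command. As noted in the comment above, this would at most affect stats for BYTE_ARRAY columns; other column types are not covered by these options.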
