Accept parquet schemas without explicitly required Map keys #5630
Can we add a test for this? I suspect the Dremel shredding will not work correctly with just this change. Perhaps you could get such a file added to the parquet-testing repo so that an integration test can be written.
I've ensured that the key is always non-nullable in Arrow and added a test in the module, following the same pattern as the other schema conversions. I can provide a test file, but I'm not sure how integration tests are written, and I don't see any reference to the test files other than in the README. Should I just add the file to 'parquet-testing/data' and add an entry in the README?
https://github.com/apache/parquet-testing/pull/45/files would be a good example
Also see apache/parquet-testing#47 (comment)
I would like to see if we can get apache/parquet-testing#47 in first so that we can get a full end-to-end test here. If things drag out though we can move ahead as is
apache/parquet-testing#47 was merged last week, so I think it should just be a case of updating this PR with an integration test based off that
How does that look? I've added a test in what I think is an appropriate place, with similar tests. Let me know if there's something specific we should add to the test.
You'll need to bump the relevant submodule too in order to get that new test file
I think you just need to fix the clippy lints, then this should be good to go?
Thank you for sticking with this 👍
Which issue does this PR close?
Closes #5606.
Rationale for this change
The check is superfluous and prevents reading the huge number of files that were produced without Map keys explicitly marked as required.
What changes are included in this PR?
Relaxes the check so that it only errors when the Map key's repetition has been explicitly set to an invalid value.
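For readers skimming the thread, here is a minimal, self-contained sketch of what a relaxed check of this kind could look like. It is not the code from this PR: the `Repetition` enum, `KeyField` struct, and `convert_map_key` function are hypothetical stand-ins, and treating a repeated key as the only "explicitly invalid" case is an assumption. The sketch also reflects the point raised in the conversation that the key stays non-nullable in Arrow whatever the Parquet schema says.

```rust
/// Hypothetical stand-in for the Parquet repetition of a schema node.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Repetition {
    Required,
    Optional,
    Repeated,
}

/// Hypothetical stand-in for the Arrow-side description of the map key.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct KeyField {
    pub nullable: bool,
}

/// Relaxed check (assumed behavior): accept keys that are required or merely
/// not marked as required, and reject only a repetition that must have been
/// set deliberately and can never describe a valid map key.
pub fn convert_map_key(repetition: Repetition) -> Result<KeyField, String> {
    match repetition {
        // Writers that omit the `required` marker end up here as Optional.
        Repetition::Required | Repetition::Optional => {
            // The key is always non-nullable on the Arrow side.
            Ok(KeyField { nullable: false })
        }
        Repetition::Repeated => Err("Map keys cannot be repeated".to_string()),
    }
}
```

Under these assumptions, `convert_map_key(Repetition::Optional)` succeeds with a non-nullable key, where a strict "must be required" check would have rejected the file.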
Are there any user-facing changes?
None found