Fix parquet complex type handling (#9187)
Summary: Fixes #7776

Parquet has a notion of optional and repeated layers, which is needed in Arrow calls like [DefLevelsToBitmap](https://github.com/facebookincubator/velox/blob/7fc09667d5e22c684fdeff81da529b79cc974fee/velox/dwio/parquet/reader/PageReader.cpp#L573). This information is passed using arrow:LevelInfo. We were incorrectly computing **repeatedAncestor** by ignoring optional fields; this PR fixes that.

Parquet uses a 3-level structure for nested types: https://github.com/apache/parquet-format/blob/master/LogicalTypes.md#lists

```
// List<String> (list non-null, elements nullable)
1. required group my_list (LIST) {
2.   repeated group list {
3.     optional binary element (UTF8);
     }
   }
```

However, when we read this and convert it to **ParquetTypeWithId** in the current Velox Parquet reader, we ignore the intermediate layer 2, **repeated group list** (grandfather logic), in https://github.com/facebookincubator/velox/pull/9187/files#diff-64787e76c1b0ad12b5764770a94acd62054896a762ccead8f083a71a060f2f44R325.

Pull Request resolved: #9187

Reviewed By: mbasmanova

Differential Revision: D55975472

Pulled By: Yuhta

fbshipit-source-id: d0972b3134cc710645a9f50cd74a23efac830751
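
To make the level bookkeeping described above concrete, here is a minimal sketch of how maximum definition level, maximum repetition level, and the definition level of the closest repeated ancestor can be accumulated along a root-to-leaf schema path. It follows the general Parquet rules (an optional node adds one definition level; a repeated node adds one definition level and one repetition level). All names here (`Repetition`, `LevelInfo`, `computeLevels`, `kRequiredList`, `kOptionalList`) are hypothetical and chosen for illustration; this is not the Velox or Arrow code touched by this commit.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical types for illustration only; not the Velox or Arrow API.
enum class Repetition { kRequired, kOptional, kRepeated };

struct LevelInfo {
  int16_t defLevel = 0;                  // max definition level at the leaf
  int16_t repLevel = 0;                  // max repetition level at the leaf
  int16_t repeatedAncestorDefLevel = 0;  // def level of closest repeated ancestor
};

// Walks a root-to-leaf path and accumulates the levels.
// Required nodes add nothing, optional nodes add one definition level,
// repeated nodes add one definition level and one repetition level.
template <std::size_t N>
constexpr LevelInfo computeLevels(const Repetition (&path)[N]) {
  LevelInfo info{};
  for (std::size_t i = 0; i < N; ++i) {
    if (path[i] == Repetition::kRepeated) {
      ++info.repLevel;
      ++info.defLevel;
      // Optional ancestors seen so far are already counted in defLevel,
      // so they shift this value as well.
      info.repeatedAncestorDefLevel = info.defLevel;
    } else if (path[i] == Repetition::kOptional) {
      ++info.defLevel;
    }
  }
  return info;
}

// required group my_list (LIST) { repeated group list { optional binary element; } }
constexpr Repetition kRequiredList[] = {
    Repetition::kRequired, Repetition::kRepeated, Repetition::kOptional};

// Same shape, but assuming the outer group is optional (a nullable list).
constexpr Repetition kOptionalList[] = {
    Repetition::kOptional, Repetition::kRepeated, Repetition::kOptional};

// Compile-time checks of the two paths above.
static_assert(computeLevels(kRequiredList).repeatedAncestorDefLevel == 1, "");
static_assert(computeLevels(kOptionalList).repeatedAncestorDefLevel == 2, "");
```

For the required list in the commit message, the sketch yields definition level 2, repetition level 1, and a repeated-ancestor definition level of 1. Making the outer group optional raises these to 3, 1, and 2 respectively, which is the kind of difference that disappears if optional fields are skipped while walking the path.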
1 parent a1706c3, commit 37f4700
Showing 5 changed files with 95 additions and 30 deletions.