[Enhancement] parquet reader supports low cardinality optimization #55167
Signed-off-by: zombee0 <[email protected]>
if (_tmp_column == nullptr) {
    _tmp_column = ColumnHelper::create_column(TypeDescriptor::from_logical_type(TYPE_VARCHAR), true);
}
Use TYPE_VARCHAR_DESC instead of TypeDescriptor::from_logical_type(TYPE_VARCHAR)?
Will do.
    nullable_dst->set_has_null(nullable_codes->has_null());
}

src->reset_column();
Why reset_column?
src is used as a temporary column to store intermediate results. Calling fill_dst_column means we have got a chunk of data; the next time we call get_next, we will append data to src again, so it is reset here.
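The scratch-column pattern described in this reply can be sketched as follows. This is a minimal illustration, not the PR's code: `ToyColumn` and `ToyReader` are hypothetical stand-ins for the real column and reader classes, and the sketch copies values into the destination for simplicity, which may differ from how the actual `fill_dst_column` hands data over.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a column; not the real Column class.
struct ToyColumn {
    std::vector<std::string> values;

    void append(std::string v) { values.push_back(std::move(v)); }
    void reset_column() { values.clear(); }
    std::size_t size() const { return values.size(); }
};

// Hypothetical reader that accumulates intermediate rows in `src`
// and hands them to the caller's column once a chunk is ready.
struct ToyReader {
    ToyColumn src;  // temporary column for intermediate results

    // Append freshly decoded rows to the temporary column.
    void read(const std::vector<std::string>& rows) {
        for (const auto& r : rows) src.append(r);
    }

    // Hand the accumulated chunk to `dst`, then clear `src` so the
    // next read starts from an empty temporary column.
    void fill_dst_column(ToyColumn& dst) {
        dst.values.insert(dst.values.end(), src.values.begin(), src.values.end());
        src.reset_column();
    }
};
```

Without the reset, rows from the previous chunk would still be in `src` on the next read and would be emitted a second time.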
array_column_dst = down_cast<ArrayColumn*>(nullable_column_dst->mutable_data_column());
NullColumn* null_column_dst = nullable_column_dst->mutable_null_column();
null_column_dst->swap_column(*null_column_src);
nullable_column_src->update_has_null();
Why not directly swap the flag of src_nullable_column and dst_nullable_column?
For the data column, we fill it with the reader.
    array_column_src = down_cast<ArrayColumn*>(src.get());
    array_column_dst = down_cast<ArrayColumn*>(dst.get());
}
array_column_dst->offsets_column()->swap_column(*(array_column_src->offsets_column()));
Why not use ArrayColumn::swap_column?
We only swap the offsets column and the null column; the element column is filled by element_reader.
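To illustrate why a whole-column swap would not fit here, the sketch below swaps only the structural parts of an array column and leaves the elements alone. `ToyArrayColumn` and `swap_structure_only` are hypothetical names invented for this example; the real ArrayColumn and NullColumn types are not modeled.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a nullable array column.
struct ToyArrayColumn {
    std::vector<std::uint8_t> nulls;     // 1 = row is null
    std::vector<std::uint32_t> offsets;  // row i spans [offsets[i], offsets[i+1])
    std::vector<std::int32_t> elements;  // flattened element values
};

// Move only the null flags and offsets from src to dst.
// dst.elements is deliberately left untouched: in this sketch it is
// assumed to have been filled separately (by an element reader).
inline void swap_structure_only(ToyArrayColumn& src, ToyArrayColumn& dst) {
    dst.nulls.swap(src.nulls);
    dst.offsets.swap(src.offsets);
}
```

A full swap would also exchange `elements`, replacing the values the element reader already wrote into the destination with whatever the temporary column held.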
[Java-Extensions Incremental Coverage Report] ✅ pass: 0 / 0 (0%)
[FE Incremental Coverage Report] ✅ pass: 0 / 0 (0%)
[BE Incremental Coverage Report] ✅ pass: 175 / 195 (89.74%)
Why I'm doing:
What I'm doing:
Fixes #issue
What type of PR is this:
Does this PR entail a change in behavior?
If yes, please specify the type of change:
Checklist:
Bugfix cherry-pick branch check: