GH-32723: [C++][Parquet] Add option to use LARGE* variants of binary types #35825

Status: Closed. Wants to merge 69 commits; changes shown from 14 of the 69 commits.
e5e96ec
able to read the file
arthurpassos May 26, 2023
b9b48f8
remove diff out
arthurpassos May 26, 2023
ae62954
intermediate stage, not working properly anymore..
arthurpassos May 29, 2023
34917d5
still not working
arthurpassos May 29, 2023
835b07d
able to read the file again
arthurpassos May 29, 2023
50427c6
move use_binary_large_variants to arrowreaderproperties
arthurpassos May 30, 2023
df65ce7
cleanup a bit
arthurpassos May 30, 2023
e826b8e
back fromByteArray string & binary with setting
arthurpassos May 30, 2023
764ef98
some more adjustments
arthurpassos May 30, 2023
c6244ea
revert some stuff
arthurpassos May 30, 2023
5a4bbb0
revert some stuff
arthurpassos May 30, 2023
2d84e57
improvement
arthurpassos May 30, 2023
90f14df
remove dictionary64
arthurpassos May 30, 2023
b88b024
use 64bit on largebytearray class and initialize binary_large_variant…
arthurpassos May 31, 2023
0b53b05
add chunked string map test
arthurpassos May 31, 2023
f574e2e
add boolean comment
arthurpassos May 31, 2023
295e062
Make ChunkedRecordReader generic by using templates
arthurpassos May 31, 2023
25d7815
Make ByteArrayDictionaryReader generic with the use of templates
arthurpassos May 31, 2023
fe8d67b
make arrowbinaryhelper generic
arthurpassos May 31, 2023
35e5835
Make PlainByteArrayDecoder generic
arthurpassos May 31, 2023
9aff2f3
remove use_binary_large_variant from parquet reader properties
arthurpassos Jun 1, 2023
eb850c4
removed parquet::type::large_Byte_array
arthurpassos Jun 5, 2023
c2aab63
small adjustment
arthurpassos Jun 5, 2023
837ed6c
remove largebytearray class
arthurpassos Jun 6, 2023
35cdb99
simplify largebytearraytype a bit
arthurpassos Jun 6, 2023
a5000e1
simplify dictbytearraydecoderimpl a bit
arthurpassos Jun 6, 2023
eb71c17
remove one default argument
arthurpassos Jun 6, 2023
686a3f7
remove junk code
arthurpassos Jun 6, 2023
a61fc32
move use_binary_large_variant check inside frombytearray
arthurpassos Jun 6, 2023
e2600d0
simplify chunkedrecordreader a bit
arthurpassos Jun 6, 2023
3b86e23
simplify DictionaryRecordReaderImpl and fix DebugPrintState
arthurpassos Jun 6, 2023
cc027b7
simplify PlainByteArrayDecoderBase
arthurpassos Jun 6, 2023
177db7a
remove some todos
arthurpassos Jun 7, 2023
66223ee
Add comment explaining why struct LargeByteArrayType instead of alias
arthurpassos Jun 7, 2023
5cd39d8
address some pr comments
arthurpassos Jun 8, 2023
1089010
address a few more comments
arthurpassos Jun 8, 2023
a6c42ee
remove arrow-type include & move binarylimit trait
arthurpassos Jun 8, 2023
15be2a2
consolidate setdict
arthurpassos Jun 8, 2023
8d5ba3d
apply clangformat
arthurpassos Jun 8, 2023
fd8f979
removed todos
arthurpassos Jun 9, 2023
a5736d5
a bit more renaming
arthurpassos Jun 9, 2023
b4ecd0d
address one mor comment
arthurpassos Jun 9, 2023
9e9dff9
add overflow check in dict
arthurpassos Jun 9, 2023
ae1db20
address a few comments
arthurpassos Jun 12, 2023
09a9eaf
use int32_t explicitly
arthurpassos Jun 14, 2023
1664983
use template directly
arthurpassos Jun 14, 2023
322319e
use offset_type
arthurpassos Jun 15, 2023
1775a7a
address comments
arthurpassos Jun 15, 2023
7f6e2bf
address a few minor comments
arthurpassos Jun 16, 2023
75fb615
fix DictDecoderImpl
arthurpassos Jun 16, 2023
0801267
add non overflow test
arthurpassos Jun 16, 2023
7f09a16
string test
arthurpassos Jun 19, 2023
a8d20a4
address minor comments
arthurpassos Jun 20, 2023
5fcf4e1
use raw filereaderbuilder instead of adding a new openfile function
arthurpassos Jun 21, 2023
8901cbc
rename test
arthurpassos Jun 21, 2023
dff017a
update test file name
arthurpassos Jun 21, 2023
232e01f
update submodule?
arthurpassos Jun 21, 2023
d7d76c6
aply clang-format
arthurpassos Jun 21, 2023
90ceb07
address minor comments
arthurpassos Jun 22, 2023
0394963
delta & delta length for large*
arthurpassos Jun 22, 2023
a8df2e7
fix wrong if statements
arthurpassos Jun 22, 2023
2bb3b14
Template member variable as well
arthurpassos Jun 23, 2023
c114d44
add docstring
arthurpassos Jun 23, 2023
d1d5798
add LargeStringDictionary32Builder
arthurpassos Jun 23, 2023
0eaa60f
address a few comments
arthurpassos Jun 26, 2023
1e642fa
clang format
arthurpassos Jun 26, 2023
b299497
add binarypacked test for largebinaryvariant
arthurpassos Jun 26, 2023
2c23dd7
Revert "add binarypacked test for largebinaryvariant"
arthurpassos Jun 27, 2023
eca9d6f
only run largebinary tests if system is 64bit
arthurpassos Jul 6, 2023
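
The changes below thread a `use_binary_large_variants` flag from `ArrowReaderProperties` through the schema conversion and down to `RecordReader::Make`. For orientation, enabling it from user code would look roughly like the sketch below. This is a sketch only: it requires the Apache Arrow/Parquet headers, the setter name `set_use_binary_large_variants` is inferred from the getter added in this diff, and the PR was ultimately closed without being merged in this form.

```
// Sketch (assumed API): read a Parquet file so that BYTE_ARRAY columns
// come back as large_binary()/large_utf8() instead of the 32-bit-offset types.
#include <arrow/io/file.h>
#include <parquet/arrow/reader.h>

arrow::Status ReadWithLargeVariants(const std::string& path) {
  ARROW_ASSIGN_OR_RAISE(auto input, arrow::io::ReadableFile::Open(path));

  parquet::ArrowReaderProperties arrow_props;
  arrow_props.set_use_binary_large_variants(true);  // hypothetical setter name

  // The PR reuses FileReaderBuilder rather than adding a new OpenFile overload
  // (see commit 5fcf4e1).
  parquet::arrow::FileReaderBuilder builder;
  ARROW_RETURN_NOT_OK(builder.Open(input));
  builder.properties(arrow_props);

  std::unique_ptr<parquet::arrow::FileReader> reader;
  ARROW_RETURN_NOT_OK(builder.Build(&reader));

  std::shared_ptr<arrow::Table> table;
  return reader->ReadTable(&table);
}
```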
2 changes: 1 addition & 1 deletion cpp/examples/parquet/parquet_arrow/reader_writer.cc
@@ -137,4 +137,4 @@ int main(int argc, char** argv) {
read_single_rowgroup();
read_single_column();
read_single_column_chunk();
}
}
1 change: 1 addition & 0 deletions cpp/src/arrow/array/builder_dict.h
@@ -724,6 +724,7 @@ using BinaryDictionaryBuilder = DictionaryBuilder<BinaryType>;
using StringDictionaryBuilder = DictionaryBuilder<StringType>;
using BinaryDictionary32Builder = Dictionary32Builder<BinaryType>;
using StringDictionary32Builder = Dictionary32Builder<StringType>;
using LargeBinaryDictionary32Builder = Dictionary32Builder<LargeBinaryType>;

/// @}

2 changes: 2 additions & 0 deletions cpp/src/arrow/type.h
@@ -678,6 +678,8 @@ class ARROW_EXPORT BaseBinaryType : public DataType {

constexpr int64_t kBinaryMemoryLimit = std::numeric_limits<int32_t>::max() - 1;

constexpr int64_t kLargeBinaryMemoryLimit = std::numeric_limits<int64_t>::max() - 1;

/// \addtogroup binary-datatypes
///
/// @{
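
The new `kLargeBinaryMemoryLimit` constant mirrors the existing `kBinaryMemoryLimit`: a binary array's capacity is bounded by the width of its offsets, so 32-bit offsets cap one array's value buffer just under 2 GiB, while the LARGE* variants use 64-bit offsets. A minimal stdlib sketch of that bound (the helper `FitsInSmallBinary` is illustrative, not an Arrow API):

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Mirrors the constants added in cpp/src/arrow/type.h: usable byte capacity
// of one binary array is bounded by its offset type.
constexpr int64_t kBinaryMemoryLimit =
    std::numeric_limits<int32_t>::max() - 1;  // ~2 GiB with int32 offsets
constexpr int64_t kLargeBinaryMemoryLimit =
    std::numeric_limits<int64_t>::max() - 1;  // int64 offsets

// Hypothetical helper: can `total_bytes` of value data live in a single
// 32-bit-offset chunk?
constexpr bool FitsInSmallBinary(int64_t total_bytes) {
  return total_bytes <= kBinaryMemoryLimit;
}
```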
3 changes: 2 additions & 1 deletion cpp/src/parquet/arrow/reader.cc
@@ -219,6 +219,7 @@ class FileReaderImpl : public FileReader {
ctx->iterator_factory = SomeRowGroupsFactory(row_groups);
ctx->filter_leaves = true;
ctx->included_leaves = included_leaves;
ctx->use_binary_large_variants = reader_properties_.use_binary_large_variants();
return GetReader(manifest_.schema_fields[i], ctx, out);
}

@@ -462,7 +463,7 @@ class LeafReader : public ColumnReaderImpl {
input_(std::move(input)),
descr_(input_->descr()) {
record_reader_ = RecordReader::Make(
descr_, leaf_info, ctx_->pool, field_->type()->id() == ::arrow::Type::DICTIONARY);
descr_, leaf_info, ctx_->pool, field_->type()->id() == ::arrow::Type::DICTIONARY, false, ctx_->use_binary_large_variants);
NextRowGroup();
}

1 change: 1 addition & 0 deletions cpp/src/parquet/arrow/reader_internal.cc
@@ -85,6 +85,7 @@ using ::arrow::internal::SafeLeftShift;
using ::arrow::util::SafeLoadAs;

using parquet::internal::BinaryRecordReader;
using parquet::internal::LargeBinaryRecordReader;
using parquet::internal::DictionaryRecordReader;
using parquet::internal::RecordReader;
using parquet::schema::GroupNode;
1 change: 1 addition & 0 deletions cpp/src/parquet/arrow/reader_internal.h
@@ -109,6 +109,7 @@ struct ReaderContext {
FileColumnIteratorFactory iterator_factory;
bool filter_leaves;
std::shared_ptr<std::unordered_set<int>> included_leaves;
bool use_binary_large_variants = false;

bool IncludesLeaf(int leaf_index) const {
if (this->filter_leaves) {
2 changes: 1 addition & 1 deletion cpp/src/parquet/arrow/schema.cc
@@ -473,7 +473,7 @@ ::arrow::Result<std::shared_ptr<ArrowType>> GetTypeForNode(
SchemaTreeContext* ctx) {
ASSIGN_OR_RAISE(
std::shared_ptr<ArrowType> storage_type,
GetArrowType(primitive_node, ctx->properties.coerce_int96_timestamp_unit()));
GetArrowType(primitive_node, ctx->properties.coerce_int96_timestamp_unit(), ctx->properties.use_binary_large_variants()));
if (ctx->properties.read_dictionary(column_index) &&
IsDictionaryReadSupported(*storage_type)) {
return ::arrow::dictionary(::arrow::int32(), storage_type);
26 changes: 22 additions & 4 deletions cpp/src/parquet/arrow/schema_internal.cc
@@ -127,6 +127,23 @@ Result<std::shared_ptr<ArrowType>> FromByteArray(const LogicalType& logical_type
}
}

Result<std::shared_ptr<ArrowType>> FromLargeByteArray(const LogicalType& logical_type) {
switch (logical_type.type()) {
case LogicalType::Type::STRING:
return ::arrow::large_utf8();
case LogicalType::Type::DECIMAL:
return MakeArrowDecimal(logical_type);
case LogicalType::Type::NONE:
case LogicalType::Type::ENUM:
case LogicalType::Type::JSON:
case LogicalType::Type::BSON:
return ::arrow::large_binary();
default:
return Status::NotImplemented("Unhandled logical logical_type ",
logical_type.ToString(), " for binary array");
}
}

Result<std::shared_ptr<ArrowType>> FromFLBA(const LogicalType& logical_type,
int32_t physical_length) {
switch (logical_type.type()) {
@@ -181,7 +198,7 @@ Result<std::shared_ptr<ArrowType>> FromInt64(const LogicalType& logical_type) {

Result<std::shared_ptr<ArrowType>> GetArrowType(
Type::type physical_type, const LogicalType& logical_type, int type_length,
const ::arrow::TimeUnit::type int96_arrow_time_unit) {
const ::arrow::TimeUnit::type int96_arrow_time_unit, bool use_binary_large_variant) {
if (logical_type.is_invalid() || logical_type.is_null()) {
return ::arrow::null();
}
@@ -200,7 +217,7 @@ Result<std::shared_ptr<ArrowType>> GetArrowType(
case ParquetType::DOUBLE:
return ::arrow::float64();
case ParquetType::BYTE_ARRAY:
return FromByteArray(logical_type);
return use_binary_large_variant ? FromLargeByteArray(logical_type) : FromByteArray(logical_type);
case ParquetType::FIXED_LEN_BYTE_ARRAY:
return FromFLBA(logical_type, type_length);
default: {
@@ -213,9 +230,10 @@

Result<std::shared_ptr<ArrowType>> GetArrowType(
const schema::PrimitiveNode& primitive,
const ::arrow::TimeUnit::type int96_arrow_time_unit) {
const ::arrow::TimeUnit::type int96_arrow_time_unit,
bool use_binary_large_variant) {
return GetArrowType(primitive.physical_type(), *primitive.logical_type(),
primitive.type_length(), int96_arrow_time_unit);
primitive.type_length(), int96_arrow_time_unit, use_binary_large_variant);
}

} // namespace arrow
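
The new `FromLargeByteArray` maps Parquet logical types to the large Arrow variants exactly as `FromByteArray` does for the 32-bit-offset ones: STRING resolves to `large_utf8()`, while NONE, ENUM, JSON, and BSON resolve to `large_binary()`. A standalone stdlib distillation of that mapping (the enum and return strings are stand-ins for the real Parquet/Arrow types, not their API):

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Stand-in for parquet::LogicalType::Type (illustrative subset only).
enum class LogicalKind { STRING, DECIMAL, NONE, ENUM, JSON, BSON, MAP };

// Mirrors the switch in FromLargeByteArray: the Arrow type a BYTE_ARRAY
// column resolves to when large variants are requested.
std::string LargeArrowTypeFor(LogicalKind kind) {
  switch (kind) {
    case LogicalKind::STRING:
      return "large_utf8";
    case LogicalKind::DECIMAL:
      return "decimal";  // the real code delegates to MakeArrowDecimal()
    case LogicalKind::NONE:
    case LogicalKind::ENUM:
    case LogicalKind::JSON:
    case LogicalKind::BSON:
      return "large_binary";
    default:
      throw std::runtime_error("Unhandled logical type for binary array");
  }
}
```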
11 changes: 8 additions & 3 deletions cpp/src/parquet/arrow/schema_internal.h
@@ -30,22 +30,27 @@ namespace arrow {
using ::arrow::Result;

Result<std::shared_ptr<::arrow::DataType>> FromByteArray(const LogicalType& logical_type);
Result<std::shared_ptr<::arrow::DataType>> FromLargeByteArray(const LogicalType& logical_type);

Result<std::shared_ptr<::arrow::DataType>> FromFLBA(const LogicalType& logical_type,
int32_t physical_length);
Result<std::shared_ptr<::arrow::DataType>> FromInt32(const LogicalType& logical_type);
Result<std::shared_ptr<::arrow::DataType>> FromInt64(const LogicalType& logical_type);

Result<std::shared_ptr<::arrow::DataType>> GetArrowType(Type::type physical_type,
const LogicalType& logical_type,
int type_length);
int type_length,
bool use_binary_large_variant = false);

Result<std::shared_ptr<::arrow::DataType>> GetArrowType(
Type::type physical_type, const LogicalType& logical_type, int type_length,
::arrow::TimeUnit::type int96_arrow_time_unit = ::arrow::TimeUnit::NANO);
::arrow::TimeUnit::type int96_arrow_time_unit = ::arrow::TimeUnit::NANO,
bool use_binary_large_variant = false);

Result<std::shared_ptr<::arrow::DataType>> GetArrowType(
const schema::PrimitiveNode& primitive,
::arrow::TimeUnit::type int96_arrow_time_unit = ::arrow::TimeUnit::NANO);
::arrow::TimeUnit::type int96_arrow_time_unit = ::arrow::TimeUnit::NANO,
bool use_binary_large_variant = false);

} // namespace arrow
} // namespace parquet
201 changes: 171 additions & 30 deletions cpp/src/parquet/column_reader.cc
@@ -1988,33 +1988,33 @@ class TypedRecordReader : public TypedColumnReaderImpl<DType>,
}

void DebugPrintState() override {
const int16_t* def_levels = this->def_levels();
const int16_t* rep_levels = this->rep_levels();
const int64_t total_levels_read = levels_position_;

const T* vals = reinterpret_cast<const T*>(this->values());

if (leaf_info_.def_level > 0) {
std::cout << "def levels: ";
for (int64_t i = 0; i < total_levels_read; ++i) {
std::cout << def_levels[i] << " ";
}
std::cout << std::endl;
}

if (leaf_info_.rep_level > 0) {
std::cout << "rep levels: ";
for (int64_t i = 0; i < total_levels_read; ++i) {
std::cout << rep_levels[i] << " ";
}
std::cout << std::endl;
}

std::cout << "values: ";
for (int64_t i = 0; i < this->values_written(); ++i) {
std::cout << vals[i] << " ";
}
std::cout << std::endl;
// const int16_t* def_levels = this->def_levels();
// const int16_t* rep_levels = this->rep_levels();
// const int64_t total_levels_read = levels_position_;
//
// const T* vals = reinterpret_cast<const T*>(this->values());
//
// if (leaf_info_.def_level > 0) {
// std::cout << "def levels: ";
// for (int64_t i = 0; i < total_levels_read; ++i) {
// std::cout << def_levels[i] << " ";
// }
// std::cout << std::endl;
// }
//
// if (leaf_info_.rep_level > 0) {
// std::cout << "rep levels: ";
// for (int64_t i = 0; i < total_levels_read; ++i) {
// std::cout << rep_levels[i] << " ";
// }
// std::cout << std::endl;
// }
//
// std::cout << "values: ";
// for (int64_t i = 0; i < this->values_written(); ++i) {
//// std::cout << vals[i] << " ";
// }
// std::cout << std::endl;
}

void ResetValues() {
@@ -2135,6 +2135,48 @@ class ByteArrayChunkedRecordReader : public TypedRecordReader<ByteArrayType>,
typename EncodingTraits<ByteArrayType>::Accumulator accumulator_;
};

class LargeByteArrayChunkedRecordReader : public TypedRecordReader<LargeByteArrayType>,
virtual public LargeBinaryRecordReader {
public:
LargeByteArrayChunkedRecordReader(const ColumnDescriptor* descr, LevelInfo leaf_info,
::arrow::MemoryPool* pool, bool read_dense_for_nullable)
: TypedRecordReader<LargeByteArrayType>(descr, leaf_info, pool,
read_dense_for_nullable) {
ARROW_DCHECK_EQ(descr_->physical_type(), Type::BYTE_ARRAY);
accumulator_.builder = std::make_unique<::arrow::LargeBinaryBuilder>(pool);
}

::arrow::ArrayVector GetBuilderChunks() override {
::arrow::ArrayVector result = accumulator_.chunks;
if (result.size() == 0 || accumulator_.builder->length() > 0) {
std::shared_ptr<::arrow::Array> last_chunk;
PARQUET_THROW_NOT_OK(accumulator_.builder->Finish(&last_chunk));
result.push_back(std::move(last_chunk));
}
accumulator_.chunks = {};
return result;
}

void ReadValuesDense(int64_t values_to_read) override {
int64_t num_decoded = this->current_decoder_->DecodeArrowNonNull(
static_cast<int>(values_to_read), &accumulator_);
CheckNumberDecoded(num_decoded, values_to_read);
ResetValues();
}

void ReadValuesSpaced(int64_t values_to_read, int64_t null_count) override {
int64_t num_decoded = this->current_decoder_->DecodeArrow(
static_cast<int>(values_to_read), static_cast<int>(null_count),
valid_bits_->mutable_data(), values_written_, &accumulator_);
CheckNumberDecoded(num_decoded, values_to_read - null_count);
ResetValues();
}

private:
// Helper data structure for accumulating builder chunks
typename EncodingTraits<LargeByteArrayType>::Accumulator accumulator_;
};
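
The chunked record readers above accumulate decoded values into a builder and cut a new chunk whenever the current one would overflow the offset type; with 32-bit offsets that happens near 2 GiB per chunk, while the large variant can keep far more in one chunk. A toy stdlib simulation of that chunking policy (not the real Arrow builder API):

```cpp
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for the chunking accumulator: values are appended to the
// current chunk until adding one would exceed `limit` bytes, at which point
// the chunk is finished and a new one started (what the chunked record
// reader does at the offset-type limit).
class ChunkingAccumulator {
 public:
  explicit ChunkingAccumulator(int64_t limit) : limit_(limit) {}

  void Append(const std::string& value) {
    if (current_bytes_ + static_cast<int64_t>(value.size()) > limit_) {
      FinishChunk();
    }
    current_.push_back(value);
    current_bytes_ += static_cast<int64_t>(value.size());
  }

  // Finalize: flush any pending values, then hand back all chunks.
  std::vector<std::vector<std::string>> Finish() {
    if (!current_.empty()) FinishChunk();
    return std::move(chunks_);
  }

 private:
  void FinishChunk() {
    chunks_.push_back(std::move(current_));
    current_.clear();
    current_bytes_ = 0;
  }

  int64_t limit_;
  int64_t current_bytes_ = 0;
  std::vector<std::string> current_;
  std::vector<std::vector<std::string>> chunks_;
};
```

With a 10-byte limit, four 4-byte values split into two chunks of two values each.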

class ByteArrayDictionaryRecordReader : public TypedRecordReader<ByteArrayType>,
virtual public DictionaryRecordReader {
public:
@@ -2217,6 +2259,88 @@ class ByteArrayDictionaryRecordReader : public TypedRecordReader<ByteArrayType>,
std::vector<std::shared_ptr<::arrow::Array>> result_chunks_;
};

class LargeByteArrayDictionaryRecordReader : public TypedRecordReader<LargeByteArrayType>,
virtual public DictionaryRecordReader {
public:
LargeByteArrayDictionaryRecordReader(const ColumnDescriptor* descr, LevelInfo leaf_info,
::arrow::MemoryPool* pool, bool read_dense_for_nullable)
: TypedRecordReader<LargeByteArrayType>(descr, leaf_info, pool, read_dense_for_nullable),
builder_(pool) {
this->read_dictionary_ = true;
}

std::shared_ptr<::arrow::ChunkedArray> GetResult() override {
FlushBuilder();
std::vector<std::shared_ptr<::arrow::Array>> result;
std::swap(result, result_chunks_);
return std::make_shared<::arrow::ChunkedArray>(std::move(result), builder_.type());
}

void FlushBuilder() {
if (builder_.length() > 0) {
std::shared_ptr<::arrow::Array> chunk;
PARQUET_THROW_NOT_OK(builder_.Finish(&chunk));
result_chunks_.emplace_back(std::move(chunk));

// Also clears the dictionary memo table
builder_.Reset();
}
}

void MaybeWriteNewDictionary() {
if (this->new_dictionary_) {
/// If there is a new dictionary, we may need to flush the builder, then
/// insert the new dictionary values
FlushBuilder();
builder_.ResetFull();
auto decoder = dynamic_cast<LargeBinaryDictDecoder*>(this->current_decoder_);
decoder->InsertDictionary(&builder_);
this->new_dictionary_ = false;
}
}

void ReadValuesDense(int64_t values_to_read) override {
int64_t num_decoded = 0;
if (current_encoding_ == Encoding::RLE_DICTIONARY) {
MaybeWriteNewDictionary();
auto decoder = dynamic_cast<LargeBinaryDictDecoder*>(this->current_decoder_);
num_decoded = decoder->DecodeIndices(static_cast<int>(values_to_read), &builder_);
} else {
num_decoded = this->current_decoder_->DecodeArrowNonNull(
static_cast<int>(values_to_read), &builder_);

/// Flush values since they have been copied into the builder
ResetValues();
}
CheckNumberDecoded(num_decoded, values_to_read);
}

void ReadValuesSpaced(int64_t values_to_read, int64_t null_count) override {
int64_t num_decoded = 0;
if (current_encoding_ == Encoding::RLE_DICTIONARY) {
MaybeWriteNewDictionary();
auto decoder = dynamic_cast<LargeBinaryDictDecoder*>(this->current_decoder_);
num_decoded = decoder->DecodeIndicesSpaced(
static_cast<int>(values_to_read), static_cast<int>(null_count),
valid_bits_->mutable_data(), values_written_, &builder_);
} else {
num_decoded = this->current_decoder_->DecodeArrow(
static_cast<int>(values_to_read), static_cast<int>(null_count),
valid_bits_->mutable_data(), values_written_, &builder_);

/// Flush values since they have been copied into the builder
ResetValues();
}
ARROW_DCHECK_EQ(num_decoded, values_to_read - null_count);
}

private:
using LargeBinaryDictDecoder = DictDecoder<LargeByteArrayType>;

::arrow::LargeBinaryDictionary32Builder builder_;
std::vector<std::shared_ptr<::arrow::Array>> result_chunks_;
};
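
In the dictionary path above, each row group carries a dictionary page and the data pages hold RLE-encoded indices into it; `MaybeWriteNewDictionary` flushes the builder before installing a new dictionary, and `DecodeIndices` resolves indices against it. A toy stdlib sketch of that index resolution step (not the real decoder API):

```cpp
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Toy dictionary decode: resolve index-encoded values against a dictionary
// page, as the RLE_DICTIONARY branch of ReadValuesDense does via
// DictDecoder::DecodeIndices.
std::vector<std::string> DecodeIndices(const std::vector<std::string>& dictionary,
                                       const std::vector<int32_t>& indices) {
  std::vector<std::string> out;
  out.reserve(indices.size());
  for (int32_t idx : indices) {
    if (idx < 0 || static_cast<size_t>(idx) >= dictionary.size()) {
      throw std::out_of_range("dictionary index out of range");
    }
    out.push_back(dictionary[idx]);
  }
  return out;
}
```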

// TODO(wesm): Implement these to some satisfaction
template <>
void TypedRecordReader<Int96Type>::DebugPrintState() {}
@@ -2241,12 +2365,27 @@ std::shared_ptr<RecordReader> MakeByteArrayRecordReader(const ColumnDescriptor*
}
}

std::shared_ptr<RecordReader> MakeLargeByteArrayRecordReader(const ColumnDescriptor* descr,
LevelInfo leaf_info,
::arrow::MemoryPool* pool,
bool read_dictionary,
bool read_dense_for_nullable) {
if (read_dictionary) {
return std::make_shared<LargeByteArrayDictionaryRecordReader>(descr, leaf_info, pool,
read_dense_for_nullable);
[Review comment from a project member: "You may use clang-format to format all changed files."]
} else {
return std::make_shared<LargeByteArrayChunkedRecordReader>(
descr, leaf_info, pool, read_dense_for_nullable);
}
}

} // namespace

std::shared_ptr<RecordReader> RecordReader::Make(const ColumnDescriptor* descr,
LevelInfo leaf_info, MemoryPool* pool,
bool read_dictionary,
bool read_dense_for_nullable) {
bool read_dense_for_nullable,
bool use_binary_string_large_variants) {
switch (descr->physical_type()) {
case Type::BOOLEAN:
return std::make_shared<TypedRecordReader<BooleanType>>(descr, leaf_info, pool,
@@ -2267,8 +2406,10 @@ std::shared_ptr<RecordReader> RecordReader::Make(const ColumnDescriptor* descr,
return std::make_shared<TypedRecordReader<DoubleType>>(descr, leaf_info, pool,
read_dense_for_nullable);
case Type::BYTE_ARRAY: {
return MakeByteArrayRecordReader(descr, leaf_info, pool, read_dictionary,
read_dense_for_nullable);
return use_binary_string_large_variants ? MakeLargeByteArrayRecordReader(descr, leaf_info, pool, read_dictionary,
read_dense_for_nullable)
: MakeByteArrayRecordReader(descr, leaf_info, pool, read_dictionary,
read_dense_for_nullable);
}
case Type::FIXED_LEN_BYTE_ARRAY:
return std::make_shared<FLBARecordReader>(descr, leaf_info, pool,