Add serialization API for set of rows
Top-level rows are more efficiently serialized in row sets rather than
arrays or ranges. Arrays of ranges are still useful for repeated
nested content. The row set path can use SIMD to gather nulls and
extract indices of non-null values for serialization.
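
As an illustration of the "extract indices of non-null values" step, here is a minimal scalar sketch. It is not the code in this commit: `nonNullRows` is a hypothetical helper, the row set is modeled as a plain vector of row numbers, and the bitmap convention (bit set means non-null) is an assumption. The SIMD path mentioned above would process several rows per instruction; this loop only shows the intended result.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper, not part of the Velox API.
// Returns the rows from 'rows' that are non-null according to 'nullBits'.
// Assumption: bit i of the bitmap is 1 when row i is non-null.
std::vector<int32_t> nonNullRows(
    const std::vector<int32_t>& rows,
    const std::vector<uint64_t>& nullBits) {
  std::vector<int32_t> result;
  result.reserve(rows.size());
  for (int32_t row : rows) {
    // Each row number must address a bit inside the bitmap.
    assert(row >= 0 && static_cast<size_t>(row / 64) < nullBits.size());
    const uint64_t word = nullBits[row / 64];
    if ((word >> (row % 64)) & 1) {
      result.push_back(row);
    }
  }
  return result;
}
```

The returned indices could then drive a value-copy loop that skips nulls entirely, which is the part a gather instruction would accelerate.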

A Scratch object is added to signatures to pass reusable scratch
memory, including for top-level calls to range-serializing
functions. This can remove malloc use for temporary vectors.
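
To show why threading a scratch object through the signatures can avoid malloc, here is a small sketch under stated assumptions. `ScratchSketch` and `sumOfSizes` are hypothetical names and do not reproduce the Scratch class from this commit; the sketch only demonstrates the general pattern of handing a temporary buffer back to a caller-owned pool so the next call can reuse its capacity.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical stand-in for a scratch object; not the Velox Scratch class.
class ScratchSketch {
 public:
  // Returns a buffer of 'size' elements. Reuses a cached buffer when one is
  // available, so no allocation happens if its capacity is already sufficient.
  std::vector<int32_t> get(size_t size) {
    std::vector<int32_t> buffer;
    if (!free_.empty()) {
      buffer = std::move(free_.back());
      free_.pop_back();
    }
    buffer.resize(size);
    return buffer;
  }

  // Gives a buffer back to the pool for later reuse.
  void release(std::vector<int32_t>&& buffer) {
    free_.push_back(std::move(buffer));
  }

  size_t cached() const {
    return free_.size();
  }

 private:
  std::vector<std::vector<int32_t>> free_;
};

// Hypothetical size-estimation step that needs a temporary array.
// Each entry is counted as its payload size plus a 4-byte length prefix.
int64_t sumOfSizes(const std::vector<int32_t>& sizes, ScratchSketch& scratch) {
  std::vector<int32_t> temp = scratch.get(sizes.size());
  int64_t total = 0;
  for (size_t i = 0; i < sizes.size(); ++i) {
    temp[i] = sizes[i] + 4;
    total += temp[i];
  }
  scratch.release(std::move(temp));
  return total;
}
```

A caller would create one `ScratchSketch` and pass it to every call, in the same way the diff below adds a `Scratch&` parameter to `estimateSerializedSize` and `append`.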

The API is tested standalone but is not connected to running code, so
this diff does not affect running systems.
Orri Erling committed Dec 11, 2023
1 parent 4f95700 commit f3120d6
Showing 13 changed files with 1,414 additions and 65 deletions.
6 changes: 4 additions & 2 deletions velox/serializers/CompactRowSerializer.cpp
@@ -22,7 +22,8 @@ namespace facebook::velox::serializer {
void CompactRowVectorSerde::estimateSerializedSize(
VectorPtr /* vector */,
const folly::Range<const IndexRange*>& /* ranges */,
vector_size_t** /* sizes */) {
vector_size_t** /* sizes */,
Scratch& /*scratch*/) {
VELOX_UNSUPPORTED();
}

@@ -36,7 +37,8 @@ class CompactRowVectorSerializer : public VectorSerializer {

void append(
const RowVectorPtr& vector,
const folly::Range<const IndexRange*>& ranges) override {
const folly::Range<const IndexRange*>& ranges,
Scratch& scratch) override {
size_t totalSize = 0;
row::CompactRow row(vector);
if (auto fixedRowSize =
3 changes: 2 additions & 1 deletion velox/serializers/CompactRowSerializer.h
@@ -28,7 +28,8 @@ class CompactRowVectorSerde : public VectorSerde {
void estimateSerializedSize(
VectorPtr vector,
const folly::Range<const IndexRange*>& ranges,
vector_size_t** sizes) override;
vector_size_t** sizes,
Scratch& scratch) override;

// This method is not used in production code. It is only used to
// support round-trip tests for deserialization.