Add serialization API for set of rows #7883

oerling · 2023-12-05T18:36:54Z

Top-level rows are serialized more efficiently as row sets than as arrays of ranges. Arrays of ranges are still useful for repeated nested content. The row set path can use SIMD to gather nulls and extract the indices of non-null values for serialization.
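The two steps named above (gathering the null flags of the selected rows, then listing the positions of the non-null ones) can be sketched in scalar form as follows. This is only an illustration: `gatherBitsScalar` and `indicesOfSetBitsScalar` are hypothetical stand-ins, not the `simd::gatherBits` and `simd::indicesOfSetBits` functions from the diff, and the sketch assumes that a set bit means "not null", as the review discussion suggests.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical scalar stand-in for a SIMD bit gather: copies the flag bit of
// each selected row into consecutive bit positions of `out`.
// Assumed convention: a set bit means "not null".
inline void gatherBitsScalar(
    const uint64_t* bits,
    const std::vector<int32_t>& rows,
    std::vector<uint64_t>& out) {
  out.assign((rows.size() + 63) / 64, 0);
  for (size_t i = 0; i < rows.size(); ++i) {
    const int32_t row = rows[i];
    const uint64_t bit = (bits[row / 64] >> (row % 64)) & 1;
    out[i / 64] |= bit << (i % 64);
  }
}

// Hypothetical scalar stand-in: writes the positions of the set bits in
// [0, numBits) to `indices` and returns how many were written.
inline int32_t indicesOfSetBitsScalar(
    const uint64_t* bits,
    int32_t numBits,
    int32_t* indices) {
  int32_t count = 0;
  for (int32_t i = 0; i < numBits; ++i) {
    if ((bits[i / 64] >> (i % 64)) & 1) {
      indices[count++] = i;
    }
  }
  return count;
}
```

The resulting index list addresses positions within the row set, not row numbers in the original vector, so a caller would map each index back through `rows` to reach the value.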

A Scratch object is added to signatures to pass reusable scratch memory, including for top-level calls to the range-serializing functions. This can remove malloc use for temporary vectors.
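The idea of passing reusable scratch memory can be illustrated with a minimal sketch. `ScratchPool` and `ScratchLease` are hypothetical names invented for this example and do not reproduce the `Scratch` and `ScratchPtr` classes in the diff; the sketch only shows how a buffer returned to a pool can serve a later call without a new allocation.

```cpp
#include <cstddef>
#include <cstdint>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical pool of reusable byte buffers (not the Velox Scratch class).
class ScratchPool {
 public:
  // Hands out a previously returned buffer, or an empty one if none is free.
  std::vector<uint8_t> take() {
    if (free_.empty()) {
      return {};
    }
    std::vector<uint8_t> buffer = std::move(free_.back());
    free_.pop_back();
    return buffer;
  }

  // Returns a buffer to the pool so that a later lease can reuse it.
  void give(std::vector<uint8_t>&& buffer) {
    free_.push_back(std::move(buffer));
  }

 private:
  std::vector<std::vector<uint8_t>> free_;
};

// Hypothetical RAII lease: borrows one buffer for the lifetime of the object.
template <typename T>
class ScratchLease {
  static_assert(std::is_trivially_copyable<T>::value, "plain data only");

 public:
  explicit ScratchLease(ScratchPool& pool)
      : pool_(pool), buffer_(pool.take()) {}
  ~ScratchLease() {
    pool_.give(std::move(buffer_));
  }
  ScratchLease(const ScratchLease&) = delete;
  ScratchLease& operator=(const ScratchLease&) = delete;

  // Returns space for `n` elements, growing the buffer only when it is too small.
  T* get(size_t n) {
    if (buffer_.size() < n * sizeof(T)) {
      buffer_.resize(n * sizeof(T));
    }
    return reinterpret_cast<T*>(buffer_.data());
  }

 private:
  ScratchPool& pool_;
  std::vector<uint8_t> buffer_;
};
```

In this sketch a function would take a `ScratchPool&` parameter and create leases for its temporaries; once the pool has warmed up, repeated calls reuse the same buffers instead of allocating.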

The API is tested standalone but is not connected to running code, so this diff does not affect running systems.

netlify · 2023-12-05T18:37:00Z

✅ Deploy Preview for meta-velox canceled.

Name	Link
🔨 Latest commit	`26b000a`
🔍 Latest deploy log	https://app.netlify.com/sites/meta-velox/deploys/65773ad8d939e40008f4e6c9

facebook-github-bot · 2023-12-05T18:44:10Z

@oerling has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

Yuhta · 2023-12-08T15:12:43Z

velox/serializers/PrestoSerializer.cpp

+    auto nulls = nullsHolder.get(bits::nwords(numRows));
+    simd::gatherBits(rawNulls, rows, nulls);
+    auto nonNulls = nonNullsHolder.get(numRows);
+    const auto numNonNull = simd::indicesOfSetBits(nulls, 0, numRows, nonNulls);


Add a comment that we expect mostly set bits, so we prefer materializing indices to maximize instruction-level parallelism.
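To make the trade-off in this comment concrete, here is a small sketch of the two loop shapes being compared. The function names are hypothetical and the code is not taken from the diff. The first loop tests a bit for every row; the second consumes a precomputed index list, so its body has no data-dependent branch. Whether the second form is actually faster is the reviewer's expectation for mostly non-null data, not something this sketch measures.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical per-row variant: one data-dependent branch per row.
// Assumed convention: a set bit means "not null".
inline size_t copyNonNullBranchy(
    const int64_t* values,
    const uint64_t* nonNullBits,
    int32_t numRows,
    int64_t* out) {
  size_t count = 0;
  for (int32_t i = 0; i < numRows; ++i) {
    if ((nonNullBits[i / 64] >> (i % 64)) & 1) {
      out[count++] = values[i];
    }
  }
  return count;
}

// Hypothetical index-based variant: the indices were materialized beforehand,
// so every iteration is an independent load and store.
inline size_t copyNonNullIndexed(
    const int64_t* values,
    const int32_t* nonNullIndices,
    int32_t numNonNull,
    int64_t* out) {
  for (int32_t i = 0; i < numNonNull; ++i) {
    out[i] = values[nonNullIndices[i]];
  }
  return static_cast<size_t>(numNonNull);
}
```

Both functions produce the same output for the same input; they differ only in how the non-null positions are discovered.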

Yuhta · 2023-12-08T15:24:16Z

velox/serializers/tests/PrestoSerializerTest.cpp

    auto size = serializer->maxSerializedSize();
+    LOG(INFO) << "Size=" << size << " estimate=" << sizeEstimate << " "


Remove logging or turn it into an assertion

Yuhta · 2023-12-08T15:37:08Z

@oerling There are also some tests failing in the internal builds

oerling · 2023-12-10T17:31:52Z

Edited it.

@Yuhta commented on this pull request (Friday, December 8, 2023 7:36 AM):

In velox/serializers/PrestoSerializer.cpp:

+    uint8_t nullsByte = *reinterpret_cast<const uint8_t*>(nulls);
+    numNonNull = __builtin_popcount(nullsByte);
+    nonNullIndices =
+        numNonNull == numRows ? nullptr : simd::byteSetBits(nullsByte);
+  } else {
+    auto mutableIndices = nonNullHolder.get(numRows);
+    numNonNull = simd::indicesOfSetBits(nulls, 0, numRows, mutableIndices);
+    nonNullIndices = numNonNull == numRows ? nullptr : mutableIndices;
+  }
+  stream->appendNulls(nulls, 0, rows.size(), numNonNull);
+  ByteStream& out = stream->values();
+
+  if constexpr (sizeof(T) == 8) {
+    AppendWindow<int64_t> window(out, scratch);
+    int64_t* output = window.get(numNonNull);
+    if (numNonNull == numRows) {

Remove these 2 lines

In velox/serializers/PrestoSerializer.cpp:

+template <TypeKind Kind>
+void estimateFlatSerializedSize(
+    const BaseVector* vector,
+    const folly::Range<const vector_size_t*>& rows,
+    vector_size_t** sizes,
+    Scratch& scratch) {
+  const auto valueSize = vector->type()->cppSizeInBytes();
+  const auto numRows = rows.size();
+  if (vector->mayHaveNulls()) {
+    auto rawNulls = vector->rawNulls();
+    ScratchPtr<uint64_t, 4> nullsHolder(scratch);
+    ScratchPtr<int32_t, 64> nonNullsHolder(scratch);
+    auto nulls = nullsHolder.get(bits::nwords(numRows));
+    simd::gatherBits(rawNulls, rows, nulls);
+    auto nonNulls = nonNullsHolder.get(numRows);
+    const auto numNonNull = simd::indicesOfSetBits(nulls, 0, numRows, nonNulls);

Add a comment that we expect mostly set bits so prefer materializing indices to maximize instruction level parallelism

In velox/serializers/tests/PrestoSerializerTest.cpp:

     auto size = serializer->maxSerializedSize();
+    LOG(INFO) << "Size=" << size << " estimate=" << sizeEstimate << " "

Remove logging or turn it into an assertion

In velox/serializers/tests/PrestoSerializerTest.cpp:

+  uint64_t irTime{0};
+  uint64_t rrTime{0};
+
+  std::string toString() {
+    return fmt::format(
+        "{} of {} {} bit {}%null: {} ir / {} rr",
+        numSelected,
+        vectorSize,
+        bits,
+        nullPct,
+        irTime,
+        rrTime);
+  }
+};
+
+TEST_P(PrestoSerializerTest, timeFlat) {

We should probably split this into its own benchmark file; there is no validation or dependency on the rest of the file.

facebook-github-bot · 2023-12-10T20:15:22Z

@oerling has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

facebook-github-bot · 2023-12-11T01:44:36Z

@oerling has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

facebook-github-bot · 2023-12-11T04:16:51Z

@oerling has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

facebook-github-bot · 2023-12-12T01:35:58Z

@oerling merged this pull request in 25398f2.

conbench-facebook · 2023-12-12T02:00:16Z

Conbench analyzed the 1 benchmark run on commit 25398f21.

There were no benchmark performance regressions. 🎉

The full Conbench report has more details.

facebook-github-bot added the CLA Signed label Dec 5, 2023

oerling requested a review from Yuhta December 5, 2023 18:45

Yuhta mentioned this pull request Dec 8, 2023

Optimize Presto Serialization #7565

Closed

Yuhta requested a review from xiaoxmeng December 8, 2023 15:17

Yuhta reviewed Dec 8, 2023

oerling force-pushed the row-ser-pr branch from 45d870c to f3120d6 Compare December 11, 2023 01:31

oerling force-pushed the row-ser-pr branch from f3120d6 to 744fc8a Compare December 11, 2023 04:15

oerling force-pushed the row-ser-pr branch from 744fc8a to 26b000a Compare December 11, 2023 16:37

facebook-github-bot closed this in 25398f2 Dec 12, 2023

facebook-github-bot added the Merged label Dec 12, 2023

icejoywoo mentioned this pull request Jan 14, 2024

SimdUtilTest.gatherBits does not work for arm64(M1) #8377

Closed
