Fix Presto serde bug when deserializing large payload #11177
Conversation
@tanjialiang has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
Force-pushed from 031af67 to 470a67e.
@tanjialiang good catch. The fix looks good % minor. Thanks!
auto uncompress =
    codec->uncompress(compressBuf.get(), header.uncompressedSize);
ByteRange byteRange{
    uncompress->writableData(), (int32_t)uncompress->length(), 0};
std::vector<ByteRange> byteRanges;
Not sure if we have a similar utility for this already. Can you add a utility in velox/common/base/IOUtils.h for this, with a unit test? Something like:
std::vector<ByteRange> writableByteRangesFromIOBuf(folly::IOBuf*);
@@ -4220,12 +4220,19 @@ void PrestoVectorSerde::deserialize(
auto compressBuf = folly::IOBuf::create(header.compressedSize);
source->readBytes(compressBuf->writableData(), header.compressedSize);
compressBuf->append(header.compressedSize);

// Process chained uncompressed results IOBufs.
Hmm, we never ran into this issue in production? Or have we not turned on compression? I think compression is at least enabled for spill.
A single production buffer is small enough not to trigger this bug.
What's the size needed to trigger this? Can you mention it in the PR description? Thanks!
Force-pushed from 96e4d46 to 1bef62b.
@tanjialiang thanks for the update % nits.
velox/common/base/IOUtils.h (outdated)

namespace facebook::velox::common {

std::vector<ByteRange> writableByteRangesFromIOBuf(folly::IOBuf* iobuf);
Maybe just call it byteRangesFromIOBuf. It is used for an input stream, but ByteRange only takes a non-const pointer anyway.
TEST_F(IOUtilsTest, iobufConsume) {
  const uint64_t bufCapacity = 1024;
  const uint64_t numChainedBuf = 64;
Can you test the empty, one, and multiple buffer cases? Thanks!
Force-pushed from 1bef62b to cd5f99b.
Force-pushed from cd5f99b to 46f36f4.
…tor#11177) Summary: The codec gives back chained IOBufs when the payload is very large. We previously handled only the unchained scenario and were able to get away with it. Fix the case for large payloads and add a unit test for it. The estimated buffer size needed to trigger this bug is around 80MB with GZIP, ZSTD and ZLIB. Reviewed By: xiaoxmeng Differential Revision: D63962642 Pulled By: tanjialiang
Force-pushed from 46f36f4 to e8efc11.
This pull request was exported from Phabricator. Differential Revision: D63962642
Force-pushed from e8efc11 to 0451b14.
@tanjialiang merged this pull request in 22d73a0.
Conbench analyzed the 1 benchmark run on this commit. There were no benchmark performance regressions. 🎉 The full Conbench report has more details.
…tor#11177) Summary: The codec gives back chained IOBufs when the payload is very large. We previously handled only the unchained scenario and were able to get away with it. Fix the case for large payloads and add a unit test for it. The estimated buffer size needed to trigger this bug is around 80MB with GZIP, ZSTD and ZLIB. Pull Request resolved: facebookincubator#11177 Reviewed By: xiaoxmeng Differential Revision: D63962642 Pulled By: tanjialiang fbshipit-source-id: cbaf2bb5518de786c69461b5f8d725732c9f6fe8