Skip to content

Commit

Permalink
Copy input in json_parse to avoid ASAN error (#7858)
Browse files Browse the repository at this point in the history
Summary:

`value` is put on offset 32 of the stack frame, and the total frame size is 88.  In case `value` is inlined, and SIMD register is 64 bytes (e.g. AVX512), simdjson is reading the memory from offset 36 to 100, which exceeds the frame boundary.

This caused some ASAN error, but no issue in production, because the padding data is not really used or changed.

Fix it by asking `simdjson::dom::parser::parse` to copy the input to internal padded memory owned by parser.  There is some minor performance loss but is probably not noticeable at query level.

Reviewed By: mbasmanova

Differential Revision: D51810542
  • Loading branch information
Yuhta authored and facebook-github-bot committed Dec 5, 2023
1 parent 975ca3a commit 28a033c
Show file tree
Hide file tree
Showing 2 changed files with 6 additions and 3 deletions.
2 changes: 2 additions & 0 deletions velox/dwio/common/SelectiveRepeatedColumnReader.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -318,7 +318,9 @@ void SelectiveMapColumnReader::read(
makeNestedRowSet(activeRows, rows.back());
if (keyReader_ && elementReader_ && !nestedRows_.empty()) {
keyReader_->read(keyReader_->readOffset(), nestedRows_, nullptr);
LOG(INFO) << nestedRows_.size();
nestedRows_ = keyReader_->outputRows();
LOG(INFO) << nestedRows_.size();
if (!nestedRows_.empty()) {
elementReader_->read(elementReader_->readOffset(), nestedRows_, nullptr);
}
Expand Down
7 changes: 4 additions & 3 deletions velox/functions/prestosql/JsonFunctions.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -86,10 +86,10 @@ class JsonParseFunction : public exec::VectorFunction {
// validation of JSON syntax that doesn't allocate memory or copy data.
assert(args.size() > 0);
const auto& arg = args[0];
static_assert(simdjson::SIMDJSON_PADDING <= AlignedBuffer::kPaddedSize);
if (arg->isConstantEncoding()) {
auto value = arg->as<ConstantVector<StringView>>()->valueAt(0);
auto parsed = parser_.parse(value.data(), value.size(), false);
// Still need to copy the input in case `value' is inlined.
auto parsed = parser_.parse(value.data(), value.size(), true);
if (parsed.error() != simdjson::SUCCESS) {
context.setErrors(rows, errors_[parsed.error()]);
return;
Expand All @@ -104,7 +104,8 @@ class JsonParseFunction : public exec::VectorFunction {

rows.applyToSelected([&](auto row) {
auto value = flatInput->valueAt(row);
auto parsed = parser_.parse(value.data(), value.size(), false);
// Still need to copy the input in case `value' is inlined.
auto parsed = parser_.parse(value.data(), value.size(), true);
if (parsed.error() != simdjson::SUCCESS) {
context.setVeloxExceptionError(row, errors_[parsed.error()]);
}
Expand Down

0 comments on commit 28a033c

Please sign in to comment.