Check validation of bit offset when reading bit packed values #66

Merged 1 commit on May 21, 2024

copperybean

Rationale for this change

Currently there is no validation before the right shift, even though *bit_offset comes from file data and may be greater than or equal to 64. In that case the shift is undefined and the value read may be unexpected, so it is better to throw an exception than to return an incorrect result.
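
The kind of guard described above can be sketched as follows. This is only an illustration under stated assumptions: CheckedExtract is a hypothetical helper (it is not part of Arrow), and std::invalid_argument is an assumed exception type; the actual patch may place the check elsewhere and raise a different error.

```cpp
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical helper, not Arrow's API: returns the num_bits bits that start
// at bit_offset inside a single 64-bit buffered word.
inline uint64_t CheckedExtract(uint64_t buffered_values, int bit_offset, int num_bits) {
  // Shifting a uint64_t by 64 or more (or by a negative amount) is undefined
  // behavior in C++, so reject such offsets before shifting.
  if (bit_offset < 0 || bit_offset >= 64) {
    throw std::invalid_argument("invalid bit offset: " + std::to_string(bit_offset));
  }
  if (num_bits < 0 || num_bits > 64) {
    throw std::invalid_argument("invalid bit width: " + std::to_string(num_bits));
  }
  const uint64_t shifted = buffered_values >> bit_offset;
  // Keep only the low num_bits bits. The full-width case is handled
  // separately because (1 << 64) would itself be an out-of-range shift.
  const uint64_t mask =
      (num_bits == 64) ? ~uint64_t{0} : ((uint64_t{1} << num_bits) - 1);
  return shifted & mask;
}
```

For example, CheckedExtract(0xF0, 4, 4) yields 0xF, while an offset of 64 raises the exception instead of performing the undefined shift.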

The problem was found by the ClickHouse special build check on a PR. The detailed error message is:

log content

May 20 14:13:58 FAILED: src/CMakeFiles/dbms.dir/Processors/Formats/Impl/Parquet/ParquetDataValuesReader.cpp.o
May 20 14:13:58 /usr/bin/cmake -E _run_co_compile --launcher="prlimit;--as=10000000000;--data=5000000000;--cpu=1000;/usr/bin/sccache" --tidy="/usr/bin/clang-tidy-cache;/usr/bin/clang-tidy-18;--extra-arg-before=--driver-mode=g++" --source=/build/src/Processors/Formats/Impl/Parquet/ParquetDataValuesReader.cpp -- /usr/bin/clang++-18 --target=x86_64-linux-gnu --sysroot=/build/cmake/linux/../../contrib/sysroot/linux-x86_64/x86_64-linux-gnu/libc -DANNOYLIB_MULTITHREADED_BUILD -DBOOST_ASIO_HAS_STD_INVOKE_RESULT=1 -DBOOST_ASIO_STANDALONE=1 -DBOOST_TIMER_ENABLE_DEPRECATED=1 -DCARES_STATICLIB -DCONFIGDIR="" -DDUMMY_BACKTRACE -DENABLE_ANNOY -DENABLE_MULTITARGET_CODE=1 -DENABLE_QPL_COMPRESSION -DENABLE_USEARCH -DENABLE_ZSTD_QAT_CODEC -DHAVE_BZLIB_H=1 -DHAVE_CONFIG_H -DHAVE_FUTIMESAT=1 -DHAVE_ICONV=1 -DHAVE_LIBLZMA=1 -DHAVE_LIBZSTD=1 -DHAVE_LIBZSTD_COMPRESSOR=1 -DHAVE_LINUX_FS_H=1 -DHAVE_LINUX_TYPES_H=1 -DHAVE_LZMA_H=1 -DHAVE_STRUCT_STAT_ST_MTIM_TV_NSEC=1 -DHAVE_SYS_STATFS_H=1 -DHAVE_ZLIB_H=1 -DHAVE_ZSTD_H=1 -DINCBIN_SILENCE_BITCODE_WARNING -DINTREE -DLIBSASL_EXPORTS=1 -DLZ4_DISABLE_DEPRECATE_WARNINGS=1 -DLZ4_FAST_DEC_LOOP=1 -DMAJOR_IN_SYSMACROS=1 -DOBSOLETE_CRAM_ATTR=1 -DOBSOLETE_DIGEST_ATTR=1 -DPLUGINDIR="" -DPOCO_ENABLE_CPP11 -DPOCO_HAVE_FD_EPOLL -DPOCO_OS_FAMILY_UNIX -DSASLAUTHD_CONF_FILE_DEFAULT="" -DSNAPPY_CODEC_AVAILABLE -DSTD_EXCEPTION_HAS_STACK_TRACE=1 -DUNALIGNED_OK -DUSE_CLICKHOUSE_THREADS=1 -DWITH_COVERAGE=0 -DWITH_GZFILEOP -DX86_64 -DZLIB_COMPAT -D_LIBCPP_ENABLE_THREAD_SAFETY_ANNOTATIONS -D_LIBUNWIND_IS_NATIVE_ONLY -I/build/build_docker/includes/configs -I/build/src -I/build/build_docker/src -I/build/build_docker/src/Core/include -I/build/build_docker/rust/workspace/skim/include -I/build/base/glibc-compatibility/memcpy -I/build/base/base/.. -I/build/build_docker/base/base/.. -I/build/contrib/cctz/include -I/build/contrib/re2 -I/build/base/pcg-random/. 
-I/build/contrib/libfiu/libfiu -I/build/contrib/libssh/include -I/build/build_docker/contrib/libssh/include -I/build/contrib/miniselect/include -I/build/contrib/zstd/lib -I/build/contrib/pocketfft -I/build/contrib/libarchive-cmake -I/build/contrib/libarchive/libarchive -I/build/build_docker/contrib/cyrus-sasl-cmake -I/build/contrib/lz4/lib -I/build/src/Common/mysqlxx/. -isystem /build/build_docker/contrib/orc/c++/include -isystem /build/contrib/llvm-project/libcxx/include -isystem /build/contrib/llvm-project/libcxxabi/include -isystem /build/contrib/libunwind/include -isystem /build/contrib/libdivide-cmake/. -isystem /build/contrib/libdivide -isystem /build/contrib/jemalloc-cmake/include -isystem /build/contrib/llvm-project/llvm/include -isystem /build/build_docker/contrib/llvm-project/llvm/include -isystem /build/contrib/abseil-cpp -isystem /build/contrib/croaring/cpp -isystem /build/contrib/croaring/include -isystem /build/contrib/sparsehash-c11 -isystem /build/contrib/incbin -isystem /build/contrib/cityhash102/include -isystem /build/contrib/boost -isystem /build/base/poco/Net/include -isystem /build/base/poco/Foundation/include -isystem /build/base/poco/NetSSL_OpenSSL/include -isystem /build/base/poco/Crypto/include -isystem /build/contrib/openssl-cmake/linux_x86_64/include -isystem /build/contrib/openssl/include -isystem /build/base/poco/Util/include -isystem /build/base/poco/JSON/include -isystem /build/base/poco/XML/include -isystem /build/contrib/replxx/include -isystem /build/contrib/fmtlib-cmake/../fmtlib/include -isystem /build/contrib/magic_enum/include -isystem /build/contrib/double-conversion -isystem /build/contrib/dragonbox/include -isystem /build/contrib/zlib-ng -isystem /build/build_docker/contrib/zlib-ng-cmake -isystem /build/contrib/pdqsort -isystem /build/contrib/xz/src/liblzma/api -isystem /build/contrib/aws/src/aws-cpp-sdk-core/include -isystem /build/build_docker/contrib/aws-cmake/include -isystem 
/build/contrib/aws/generated/src/aws-cpp-sdk-s3/include -isystem /build/contrib/aws-c-auth/include -isystem /build/contrib/aws-c-common/include -isystem /build/contrib/aws-c-io/include -isystem /build/contrib/aws-crt-cpp/include -isystem /build/contrib/aws-c-mqtt/include -isystem /build/contrib/aws-c-sdkutils/include -isystem /build/contrib/azure/sdk/core/azure-core/inc -isystem /build/contrib/azure/sdk/identity/azure-identity/inc -isystem /build/contrib/azure/sdk/storage/azure-storage-common/inc -isystem /build/contrib/azure/sdk/storage/azure-storage-blobs/inc -isystem /build/contrib/snappy -isystem /build/build_docker/contrib/snappy-cmake -isystem /build/contrib/libbcrypt -isystem /build/contrib/msgpack-c/include -isystem /build/build_docker/contrib/liburing/src/include-compat -isystem /build/build_docker/contrib/liburing/src/include -isystem /build/contrib/liburing/src/include -isystem /build/contrib/fast_float/include -isystem /build/contrib/QAT-ZSTD-Plugin/src -isystem /build/contrib/librdkafka-cmake/include -isystem /build/contrib/librdkafka/src -isystem /build/build_docker/contrib/librdkafka-cmake/auxdir -isystem /build/contrib/cppkafka/include -isystem /build/contrib/nats-io/src -isystem /build/contrib/nats-io/src/adapters -isystem /build/contrib/nats-io/src/include -isystem /build/contrib/nats-io/src/unix -isystem /build/contrib/libuv/include -isystem /build/contrib/krb5/src/include -isystem /build/build_docker/contrib/krb5-cmake/include -isystem /build/contrib/NuRaft/include -isystem /build/base/poco/MongoDB/include -isystem /build/base/poco/Redis/include -isystem /build/build_docker/contrib/mariadb-connector-c-cmake/include-public -isystem /build/contrib/mariadb-connector-c/include -isystem /build/contrib/mariadb-connector-c/libmariadb -isystem /build/contrib/icu/icu4c/source/i18n -isystem /build/contrib/icu/icu4c/source/common -isystem /build/contrib/capnproto/c++/src -isystem /build/contrib/arrow/cpp/src -isystem /build/contrib/arrow-cmake/cpp/src 
-isystem /build/build_docker/contrib/arrow-cmake/../orc/c++/include -isystem /build/contrib/orc/c++/include -isystem /build/contrib/arrow-cmake/cpp/src/orc/c++/include -isystem /build/contrib/thrift/lib/cpp/src -isystem /build/build_docker/contrib/thrift-cmake -isystem /build/contrib/avro/lang/c++/api -isystem /build/contrib/openldap-cmake/linux_x86_64/include -isystem /build/contrib/openldap/include -isystem /build/contrib/google-protobuf/src -isystem /build/build_docker/src/Server/grpc_protos -isystem /build/contrib/grpc/include -isystem /build/contrib/c-ares/src/lib -isystem /build/contrib/c-ares/include -isystem /build/contrib/c-ares-cmake/linux -isystem /build/contrib/libhdfs3/include -isystem /build/contrib/hive-metastore -isystem /build/contrib/s2geometry/src -isystem /build/contrib/s2geometry-cmake -isystem /build/contrib/vectorscan/src -isystem /build/contrib/AMQP-CPP/include -isystem /build/contrib/AMQP-CPP -isystem /build/contrib/sqlite-amalgamation -isystem /build/contrib/rocksdb/include -isystem /build/contrib/libpqxx/include -isystem /build/contrib/libpq -isystem /build/contrib/libpq/include -isystem /build/contrib/qpl-cmake -isystem /build/contrib/qpl/include -isystem /build/contrib/idxd-config/accfg -isystem /build/contrib/libstemmer_c/include -isystem /build/contrib/wordnet-blast -isystem /build/contrib/lemmagen-c/include -isystem /build/contrib/ulid-c/include -isystem /build/contrib/simdjson/include -isystem /build/contrib/rapidjson/include -isystem /build/contrib/consistent-hashing -isystem /build/contrib/annoy/src -isystem /build/contrib/FP16/include -isystem /build/contrib/robin-map/include -isystem /build/contrib/SimSIMD-map/include -isystem /build/contrib/usearch/include --gcc-toolchain=/build/cmake/linux/../../contrib/sysroot/linux-x86_64 -fdiagnostics-color=always -Xclang -fuse-ctor-homing -Wno-enum-constexpr-conversion -fsized-deallocation -UNDEBUG -gdwarf-aranges -pipe -mssse3 -msse4.1 -msse4.2 -mpclmul -mpopcnt 
-fasynchronous-unwind-tables -ftime-trace -falign-functions=32 -mbranches-within-32B-boundaries -ffp-contract=off -fdiagnostics-absolute-paths -fstrict-vtable-pointers -Wall -Wextra -Weverything -Wpedantic -Wno-zero-length-array -Wno-c++98-compat-pedantic -Wno-c++98-compat -Wno-c++20-compat -Wno-sign-conversion -Wno-implicit-int-conversion -Wno-implicit-int-float-conversion -Wno-ctad-maybe-unsupported -Wno-disabled-macro-expansion -Wno-documentation-unknown-command -Wno-double-promotion -Wno-exit-time-destructors -Wno-float-equal -Wno-global-constructors -Wno-missing-prototypes -Wno-missing-variable-declarations -Wno-padded -Wno-switch-enum -Wno-undefined-func-template -Wno-unused-template -Wno-vla -Wno-weak-template-vtables -Wno-weak-vtables -Wno-thread-safety-negative -Wno-enum-constexpr-conversion -Wno-unsafe-buffer-usage -Wno-switch-default -g -O0 -g -D_LIBCPP_DEBUG=0 -std=c++23 -D OS_LINUX -Werror -Wno-deprecated-declarations -Wno-poison-system-directories -nostdinc++ -MD -MT src/CMakeFiles/dbms.dir/Processors/Formats/Impl/Parquet/ParquetDataValuesReader.cpp.o -MF src/CMakeFiles/dbms.dir/Processors/Formats/Impl/Parquet/ParquetDataValuesReader.cpp.o.d -o src/CMakeFiles/dbms.dir/Processors/Formats/Impl/Parquet/ParquetDataValuesReader.cpp.o -c /build/src/Processors/Formats/Impl/Parquet/ParquetDataValuesReader.cpp
May 20 14:13:58 /build/contrib/arrow/cpp/src/arrow/util/bit_stream_utils.h:280:88: error: Right shift overflows the capacity of 'uint64_t' [clang-analyzer-core.BitwiseShift,-warnings-as-errors]
May 20 14:13:58 280 | *v = static_cast(bit_util::TrailingBits(*buffered_values, *bit_offset + num_bits) >>
May 20 14:13:58 | ^
May 20 14:13:58 /build/src/Processors/Formats/Impl/Parquet/ParquetDataValuesReader.cpp:139:12: note: Assuming 'num_values' is > 0
May 20 14:13:58 139 | while (num_values > 0)
May 20 14:13:58 | ^~~~~~~~~~~~~~
May 20 14:13:58 /build/src/Processors/Formats/Impl/Parquet/ParquetDataValuesReader.cpp:139:5: note: Loop condition is true. Entering loop body
May 20 14:13:58 139 | while (num_values > 0)
May 20 14:13:58 | ^
May 20 14:13:58 /build/src/Processors/Formats/Impl/Parquet/ParquetDataValuesReader.cpp:141:9: note: Calling 'RleValuesReader::nextGroupIfNecessary'
May 20 14:13:58 141 | nextGroupIfNecessary();
May 20 14:13:58 | ^~~~~~~~~~~~~~~~~~~~~~
May 20 14:13:58 /build/src/Processors/Formats/Impl/Parquet/ParquetDataValuesReader.h:32:39: note: Assuming field 'cur_group_cursor' is >= field 'cur_group_size'
May 20 14:13:58 32 | void nextGroupIfNecessary() { if (cur_group_cursor >= cur_group_size) nextGroup(); }
May 20 14:13:58 | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
May 20 14:13:58 /build/src/Processors/Formats/Impl/Parquet/ParquetDataValuesReader.h:32:35: note: Taking true branch
May 20 14:13:58 32 | void nextGroupIfNecessary() { if (cur_group_cursor >= cur_group_size) nextGroup(); }
May 20 14:13:58 | ^
May 20 14:13:58 /build/src/Processors/Formats/Impl/Parquet/ParquetDataValuesReader.h:32:75: note: Calling 'RleValuesReader::nextGroup'
May 20 14:13:58 32 | void nextGroupIfNecessary() { if (cur_group_cursor >= cur_group_size) nextGroup(); }
May 20 14:13:58 | ^~~~~~~~~~~
May 20 14:13:58 /build/src/Processors/Formats/Impl/Parquet/ParquetDataValuesReader.cpp:34:12: note: 'read_res' is true
May 20 14:13:58 34 | assert(read_res);
May 20 14:13:58 | ^
May 20 14:13:58 /build/cmake/linux/../../contrib/sysroot/linux-x86_64/x86_64-linux-gnu/libc/usr/include/assert.h:93:27: note: expanded from macro 'assert'
May 20 14:13:58 93 | (static_cast (expr)
May 20 14:13:58 | ^~~~
May 20 14:13:58 /build/src/Processors/Formats/Impl/Parquet/ParquetDataValuesReader.cpp:34:5: note: '?' condition is true
May 20 14:13:58 34 | assert(read_res);
May 20 14:13:58 | ^
May 20 14:13:58 /build/cmake/linux/../../contrib/sysroot/linux-x86_64/x86_64-linux-gnu/libc/usr/include/assert.h:93:7: note: expanded from macro 'assert'
May 20 14:13:58 93 | (static_cast (expr)
May 20 14:13:58 | ^
May 20 14:13:58 /build/src/Processors/Formats/Impl/Parquet/ParquetDataValuesReader.cpp:39:9: note: Assuming field 'cur_group_is_packed' is true
May 20 14:13:58 39 | if (cur_group_is_packed)
May 20 14:13:58 | ^~~~~~~~~~~~~~~~~~~
May 20 14:13:58 /build/src/Processors/Formats/Impl/Parquet/ParquetDataValuesReader.cpp:39:5: note: Taking true branch
May 20 14:13:58 39 | if (cur_group_is_packed)
May 20 14:13:58 | ^
May 20 14:13:58 /build/src/Processors/Formats/Impl/Parquet/ParquetDataValuesReader.cpp:43:9: note: Calling 'BitReader::GetBatch'
May 20 14:13:58 43 | bit_reader->GetBatch(bit_width, cur_packed_bit_values.data(), cur_group_size);
May 20 14:13:58 | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
May 20 14:13:58 /build/contrib/arrow/cpp/src/arrow/util/bit_stream_utils.h:319:10: note: Assuming the condition is false
May 20 14:13:58 319 | DCHECK(buffer != NULL);
May 20 14:13:58 | ^
May 20 14:13:58 /build/contrib/arrow/cpp/src/arrow/util/logging.h:140:16: note: expanded from macro 'DCHECK'
May 20 14:13:58 140 | #define DCHECK ARROW_DCHECK
May 20 14:13:58 | ^
May 20 14:13:58 /build/contrib/arrow/cpp/src/arrow/util/logging.h:129:22: note: expanded from macro 'ARROW_DCHECK'
May 20 14:13:58 129 | #define ARROW_DCHECK ARROW_CHECK
May 20 14:13:58 | ^
May 20 14:13:58 /build/contrib/arrow/cpp/src/arrow/util/logging.h:66:51: note: expanded from macro 'ARROW_CHECK'
May 20 14:13:58 66 | #define ARROW_CHECK(condition) ARROW_CHECK_OR_LOG(condition, FATAL)
May 20 14:13:58 | ~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~
May 20 14:13:58 /build/contrib/arrow/cpp/src/arrow/util/logging.h:62:22: note: expanded from macro 'ARROW_CHECK_OR_LOG'
May 20 14:13:58 62 | ARROW_PREDICT_TRUE(condition)
May 20 14:13:58 | ~~~~~~~~~~~~~~~~~~~^~~~~~~~~~
May 20 14:13:58 /build/contrib/arrow/cpp/src/arrow/util/macros.h:49:52: note: expanded from macro 'ARROW_PREDICT_TRUE'
May 20 14:13:58 49 | #define ARROW_PREDICT_TRUE(x) (__builtin_expect(!!(x), 1))
May 20 14:13:58 | ^
May 20 14:13:58 /build/contrib/arrow/cpp/src/arrow/util/bit_stream_utils.h:319:3: note: '?' condition is false
May 20 14:13:58 319 | DCHECK(buffer != NULL);
May 20 14:13:58 | ^
May 20 14:13:58 /build/contrib/arrow/cpp/src/arrow/util/logging.h:140:16: note: expanded from macro 'DCHECK'
May 20 14:13:58 140 | #define DCHECK ARROW_DCHECK
May 20 14:13:58 | ^
May 20 14:13:58 /build/contrib/arrow/cpp/src/arrow/util/logging.h:129:22: note: expanded from macro 'ARROW_DCHECK'
May 20 14:13:58 129 | #define ARROW_DCHECK ARROW_CHECK
May 20 14:13:58 | ^
May 20 14:13:58 /build/contrib/arrow/cpp/src/arrow/util/logging.h:66:32: note: expanded from macro 'ARROW_CHECK'
May 20 14:13:58 66 | #define ARROW_CHECK(condition) ARROW_CHECK_OR_LOG(condition, FATAL)
May 20 14:13:58 | ^
May 20 14:13:58 /build/contrib/arrow/cpp/src/arrow/util/logging.h:62:3: note: expanded from macro 'ARROW_CHECK_OR_LOG'
May 20 14:13:58 62 | ARROW_PREDICT_TRUE(condition)
May 20 14:13:58 | ^
May 20 14:13:58 /build/contrib/arrow/cpp/src/arrow/util/macros.h:49:31: note: expanded from macro 'ARROW_PREDICT_TRUE'
May 20 14:13:58 49 | #define ARROW_PREDICT_TRUE(x) (__builtin_expect(!!(x), 1))
May 20 14:13:58 | ^
May 20 14:13:58 /build/contrib/arrow/cpp/src/arrow/util/bit_stream_utils.h:320:3: note: Assuming the condition is false
May 20 14:13:58 320 | DCHECK_LE(num_bits, static_cast(sizeof(T) * 8)) << "num_bits: " << num_bits;
May 20 14:13:58 | ^
May 20 14:13:58 /build/contrib/arrow/cpp/src/arrow/util/logging.h:144:19: note: expanded from macro 'DCHECK_LE'
May 20 14:13:58 144 | #define DCHECK_LE ARROW_DCHECK_LE
May 20 14:13:58 | ^
May 20 14:13:58 /build/contrib/arrow/cpp/src/arrow/util/logging.h:133:25: note: expanded from macro 'ARROW_DCHECK_LE'
May 20 14:13:58 133 | #define ARROW_DCHECK_LE ARROW_CHECK_LE
May 20 14:13:58 | ^
May 20 14:13:58 /build/contrib/arrow/cpp/src/arrow/util/logging.h:84:48: note: expanded from macro 'ARROW_CHECK_LE'
May 20 14:13:58 84 | #define ARROW_CHECK_LE(val1, val2) ARROW_CHECK((val1) <= (val2))
May 20 14:13:58 | ~~~~~~~~~~~~^~~~~~~~~~~~~~~~~
May 20 14:13:58 /build/contrib/arrow/cpp/src/arrow/util/logging.h:66:51: note: expanded from macro 'ARROW_CHECK'
May 20 14:13:58 66 | #define ARROW_CHECK(condition) ARROW_CHECK_OR_LOG(condition, FATAL)
May 20 14:13:58 | ~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~
May 20 14:13:58 /build/contrib/arrow/cpp/src/arrow/util/logging.h:62:22: note: expanded from macro 'ARROW_CHECK_OR_LOG'
May 20 14:13:58 62 | ARROW_PREDICT_TRUE(condition)
May 20 14:13:58 | ~~~~~~~~~~~~~~~~~~~^~~~~~~~~~
May 20 14:13:58 /build/contrib/arrow/cpp/src/arrow/util/macros.h:49:52: note: expanded from macro 'ARROW_PREDICT_TRUE'
May 20 14:13:58 49 | #define ARROW_PREDICT_TRUE(x) (__builtin_expect(!!(x), 1))
May 20 14:13:58 | ^
May 20 14:13:58 /build/contrib/arrow/cpp/src/arrow/util/bit_stream_utils.h:320:3: note: '?' condition is false
May 20 14:13:58 320 | DCHECK_LE(num_bits, static_cast(sizeof(T) * 8)) << "num_bits: " << num_bits;
May 20 14:13:58 | ^
May 20 14:13:58 /build/contrib/arrow/cpp/src/arrow/util/logging.h:144:19: note: expanded from macro 'DCHECK_LE'
May 20 14:13:58 144 | #define DCHECK_LE ARROW_DCHECK_LE
May 20 14:13:58 | ^
May 20 14:13:58 /build/contrib/arrow/cpp/src/arrow/util/logging.h:133:25: note: expanded from macro 'ARROW_DCHECK_LE'
May 20 14:13:58 133 | #define ARROW_DCHECK_LE ARROW_CHECK_LE
May 20 14:13:58 | ^
May 20 14:13:58 /build/contrib/arrow/cpp/src/arrow/util/logging.h:84:36: note: expanded from macro 'ARROW_CHECK_LE'
May 20 14:13:58 84 | #define ARROW_CHECK_LE(val1, val2) ARROW_CHECK((val1) <= (val2))
May 20 14:13:58 | ^
May 20 14:13:58 /build/contrib/arrow/cpp/src/arrow/util/logging.h:66:32: note: expanded from macro 'ARROW_CHECK'
May 20 14:13:58 66 | #define ARROW_CHECK(condition) ARROW_CHECK_OR_LOG(condition, FATAL)
May 20 14:13:58 | ^
May 20 14:13:58 /build/contrib/arrow/cpp/src/arrow/util/logging.h:62:3: note: expanded from macro 'ARROW_CHECK_OR_LOG'
May 20 14:13:58 62 | ARROW_PREDICT_TRUE(condition)
May 20 14:13:58 | ^
May 20 14:13:58 /build/contrib/arrow/cpp/src/arrow/util/macros.h:49:31: note: expanded from macro 'ARROW_PREDICT_TRUE'
May 20 14:13:58 49 | #define ARROW_PREDICT_TRUE(x) (__builtin_expect(!!(x), 1))
May 20 14:13:58 | ^
May 20 14:13:58 /build/contrib/arrow/cpp/src/arrow/util/bit_stream_utils.h:332:7: note: Assuming 'remaining_bits' is >= 'needed_bits'
May 20 14:13:58 332 | if (remaining_bits < needed_bits) {
May 20 14:13:58 | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
May 20 14:13:58 /build/contrib/arrow/cpp/src/arrow/util/bit_stream_utils.h:332:3: note: Taking false branch
May 20 14:13:58 332 | if (remaining_bits < needed_bits) {
May 20 14:13:58 | ^
May 20 14:13:58 /build/contrib/arrow/cpp/src/arrow/util/bit_stream_utils.h:337:27: note: Assuming 'bit_offset' is not equal to 0
May 20 14:13:58 337 | if (ARROW_PREDICT_FALSE(bit_offset != 0)) {
May 20 14:13:58 | ^
May 20 14:13:58 /build/contrib/arrow/cpp/src/arrow/util/macros.h:48:53: note: expanded from macro 'ARROW_PREDICT_FALSE'
May 20 14:13:58 48 | #define ARROW_PREDICT_FALSE(x) (__builtin_expect(!!(x), 0))
May 20 14:13:58 | ^
May 20 14:13:58 /build/contrib/arrow/cpp/src/arrow/util/bit_stream_utils.h:337:3: note: Taking true branch
May 20 14:13:58 337 | if (ARROW_PREDICT_FALSE(bit_offset != 0)) {
May 20 14:13:58 | ^
May 20 14:13:58 /build/contrib/arrow/cpp/src/arrow/util/bit_stream_utils.h:338:12: note: Assuming 'i' is < 'batch_size'
May 20 14:13:58 338 | for (; i < batch_size && bit_offset != 0; ++i) {
May 20 14:13:58 | ^~~~~~~~~~~~~~
May 20 14:13:58 /build/contrib/arrow/cpp/src/arrow/util/bit_stream_utils.h:338:12: note: Left side of '&&' is true
May 20 14:13:58 /build/contrib/arrow/cpp/src/arrow/util/bit_stream_utils.h:338:30: note: 'bit_offset' is not equal to 0
May 20 14:13:58 338 | for (; i < batch_size && bit_offset != 0; ++i) {
May 20 14:13:58 | ^~~~~~~~~~
May 20 14:13:58 /build/contrib/arrow/cpp/src/arrow/util/bit_stream_utils.h:338:5: note: Loop condition is true. Entering loop body
May 20 14:13:58 338 | for (; i < batch_size && bit_offset != 0; ++i) {
May 20 14:13:58 | ^
May 20 14:13:58 /build/contrib/arrow/cpp/src/arrow/util/bit_stream_utils.h:339:7: note: Calling 'GetValue'
May 20 14:13:58 339 | detail::GetValue(num_bits, &v[i], max_bytes, buffer, &bit_offset, &byte_offset,
May 20 14:13:58 | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
May 20 14:13:58 340 | &buffered_values);
May 20 14:13:58 | ~~~~~~~~~~~~~~~~~
May 20 14:13:58 /build/contrib/arrow/cpp/src/arrow/util/bit_stream_utils.h:280:23: note: Assuming right operand of bit shift is non-negative but less than 64
May 20 14:13:58 280 | *v = static_cast(bit_util::TrailingBits(*buffered_values, *bit_offset + num_bits) >>
May 20 14:13:58 | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
May 20 14:13:58 281 | *bit_offset);
May 20 14:13:58 | ~~~~~~~~~~~
May 20 14:13:58 /build/contrib/arrow/cpp/src/arrow/util/bit_stream_utils.h:286:7: note: Assuming the condition is true
May 20 14:13:58 286 | if (*bit_offset >= 64) {
May 20 14:13:58 | ^~~~~~~~~~~~~~~~~
May 20 14:13:58 /build/contrib/arrow/cpp/src/arrow/util/bit_stream_utils.h:286:3: note: Taking true branch
May 20 14:13:58 286 | if (*bit_offset >= 64) {
May 20 14:13:58 | ^
May 20 14:13:58 /build/contrib/arrow/cpp/src/arrow/util/bit_stream_utils.h:288:5: note: Value assigned to 'bit_offset'
May 20 14:13:58 288 | *bit_offset -= 64;
May 20 14:13:58 | ^~~~~~~~~~~~~~~~~
May 20 14:13:58 /build/contrib/arrow/cpp/src/arrow/util/bit_stream_utils.h:296:28: note: Assuming the condition is false
May 20 14:13:58 296 | if (ARROW_PREDICT_TRUE(num_bits - *bit_offset < static_cast(8 * sizeof(T)))) {
May 20 14:13:58 | ^
May 20 14:13:58 /build/contrib/arrow/cpp/src/arrow/util/macros.h:49:52: note: expanded from macro 'ARROW_PREDICT_TRUE'
May 20 14:13:58 49 | #define ARROW_PREDICT_TRUE(x) (__builtin_expect(!!(x), 1))
May 20 14:13:58 | ^
May 20 14:13:58 /build/contrib/arrow/cpp/src/arrow/util/bit_stream_utils.h:296:5: note: Taking false branch
May 20 14:13:58 296 | if (ARROW_PREDICT_TRUE(num_bits - *bit_offset < static_cast(8 * sizeof(T)))) {
May 20 14:13:58 | ^
May 20 14:13:58 /build/contrib/arrow/cpp/src/arrow/util/bit_stream_utils.h:306:5: note: Assuming the condition is false
May 20 14:13:58 306 | DCHECK_LE(*bit_offset, 64);
May 20 14:13:58 | ^
May 20 14:13:58 /build/contrib/arrow/cpp/src/arrow/util/logging.h:144:19: note: expanded from macro 'DCHECK_LE'
May 20 14:13:58 144 | #define DCHECK_LE ARROW_DCHECK_LE
May 20 14:13:58 | ^
May 20 14:13:58 /build/contrib/arrow/cpp/src/arrow/util/logging.h:133:25: note: expanded from macro 'ARROW_DCHECK_LE'
May 20 14:13:58 133 | #define ARROW_DCHECK_LE ARROW_CHECK_LE
May 20 14:13:58 | ^
May 20 14:13:58 /build/contrib/arrow/cpp/src/arrow/util/logging.h:84:48: note: expanded from macro 'ARROW_CHECK_LE'
May 20 14:13:58 84 | #define ARROW_CHECK_LE(val1, val2) ARROW_CHECK((val1) <= (val2))
May 20 14:13:58 | ~~~~~~~~~~~~^~~~~~~~~~~~~~~~~
May 20 14:13:58 /build/contrib/arrow/cpp/src/arrow/util/logging.h:66:51: note: expanded from macro 'ARROW_CHECK'
May 20 14:13:58 66 | #define ARROW_CHECK(condition) ARROW_CHECK_OR_LOG(condition, FATAL)
May 20 14:13:58 | ~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~
May 20 14:13:58 /build/contrib/arrow/cpp/src/arrow/util/logging.h:62:22: note: expanded from macro 'ARROW_CHECK_OR_LOG'
May 20 14:13:58 62 | ARROW_PREDICT_TRUE(condition)
May 20 14:13:58 | ~~~~~~~~~~~~~~~~~~~^~~~~~~~~~
May 20 14:13:58 /build/contrib/arrow/cpp/src/arrow/util/macros.h:49:52: note: expanded from macro 'ARROW_PREDICT_TRUE'
May 20 14:13:58 49 | #define ARROW_PREDICT_TRUE(x) (__builtin_expect(!!(x), 1))
May 20 14:13:58 | ^
May 20 14:13:58 /build/contrib/arrow/cpp/src/arrow/util/bit_stream_utils.h:306:5: note: '?' condition is false
May 20 14:13:58 306 | DCHECK_LE(*bit_offset, 64);
May 20 14:13:58 | ^
May 20 14:13:58 /build/contrib/arrow/cpp/src/arrow/util/logging.h:144:19: note: expanded from macro 'DCHECK_LE'
May 20 14:13:58 144 | #define DCHECK_LE ARROW_DCHECK_LE
May 20 14:13:58 | ^
May 20 14:13:58 /build/contrib/arrow/cpp/src/arrow/util/logging.h:133:25: note: expanded from macro 'ARROW_DCHECK_LE'
May 20 14:13:58 133 | #define ARROW_DCHECK_LE ARROW_CHECK_LE
May 20 14:13:58 | ^
May 20 14:13:58 /build/contrib/arrow/cpp/src/arrow/util/logging.h:84:36: note: expanded from macro 'ARROW_CHECK_LE'
May 20 14:13:58 84 | #define ARROW_CHECK_LE(val1, val2) ARROW_CHECK((val1) <= (val2))
May 20 14:13:58 | ^
May 20 14:13:58 /build/contrib/arrow/cpp/src/arrow/util/logging.h:66:32: note: expanded from macro 'ARROW_CHECK'
May 20 14:13:58 66 | #define ARROW_CHECK(condition) ARROW_CHECK_OR_LOG(condition, FATAL)
May 20 14:13:58 | ^
May 20 14:13:58 /build/contrib/arrow/cpp/src/arrow/util/logging.h:62:3: note: expanded from macro 'ARROW_CHECK_OR_LOG'
May 20 14:13:58 62 | ARROW_PREDICT_TRUE(condition)
May 20 14:13:58 | ^
May 20 14:13:58 /build/contrib/arrow/cpp/src/arrow/util/macros.h:49:31: note: expanded from macro 'ARROW_PREDICT_TRUE'
May 20 14:13:58 49 | #define ARROW_PREDICT_TRUE(x) (__builtin_expect(!!(x), 1))
May 20 14:13:58 | ^
May 20 14:13:58 /build/contrib/arrow/cpp/src/arrow/util/bit_stream_utils.h:339:7: note: Returning from 'GetValue'
May 20 14:13:58 339 | detail::GetValue(num_bits, &v[i], max_bytes, buffer, &bit_offset, &byte_offset,
May 20 14:13:58 | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
May 20 14:13:58 340 | &buffered_values);
May 20 14:13:58 | ~~~~~~~~~~~~~~~~~
May 20 14:13:58 /build/contrib/arrow/cpp/src/arrow/util/bit_stream_utils.h:338:12: note: Assuming 'i' is < 'batch_size'
May 20 14:13:58 338 | for (; i < batch_size && bit_offset != 0; ++i) {
May 20 14:13:58 | ^~~~~~~~~~~~~~
May 20 14:13:58 /build/contrib/arrow/cpp/src/arrow/util/bit_stream_utils.h:338:12: note: Left side of '&&' is true
May 20 14:13:58 /build/contrib/arrow/cpp/src/arrow/util/bit_stream_utils.h:338:30: note: 'bit_offset' is not equal to 0
May 20 14:13:58 338 | for (; i < batch_size && bit_offset != 0; ++i) {
May 20 14:13:58 | ^~~~~~~~~~
May 20 14:13:58 /build/contrib/arrow/cpp/src/arrow/util/bit_stream_utils.h:338:5: note: Loop condition is true. Entering loop body
May 20 14:13:58 338 | for (; i < batch_size && bit_offset != 0; ++i) {
May 20 14:13:58 | ^
May 20 14:13:58 /build/contrib/arrow/cpp/src/arrow/util/bit_stream_utils.h:339:7: note: Calling 'GetValue'
May 20 14:13:58 339 | detail::GetValue(num_bits, &v[i], max_bytes, buffer, &bit_offset, &byte_offset,
May 20 14:13:58 | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
May 20 14:13:58 340 | &buffered_values);
May 20 14:13:58 | ~~~~~~~~~~~~~~~~~
May 20 14:13:58 /build/contrib/arrow/cpp/src/arrow/util/bit_stream_utils.h:280:88: note: The result of right shift is undefined because the right operand is not smaller than 64, the capacity of 'uint64_t'
May 20 14:13:58 280 | *v = static_cast(bit_util::TrailingBits(*buffered_values, *bit_offset + num_bits) >>
May 20 14:13:58 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~
May 20 14:13:58 281 | *bit_offset);
May 20 14:13:58 | ~~~~~~~~~~~
May 20 14:13:58 19235 warnings generated.
May 20 14:13:58 Suppressed 19563 warnings (19234 in non-user code, 329 NOLINT).
May 20 14:13:58 Use -header-filter=.* to display errors from all non-system headers. Use -system-headers to display errors from system headers as well.
May 20 14:13:58 1 warning treated as error

What changes are included in this PR?

Same as title

Are these changes tested?

Manually

Are there any user-facing changes?

No

Change-Id: I4bbd75fcf848f0686a7c256e761802d754ebcb95

Thanks for opening a pull request!

If this is not a minor PR, could you open an issue for this pull request on GitHub? https://github.com/apache/arrow/issues/new/choose

Opening GitHub issues ahead of time contributes to the Openness of the Apache Arrow project.

Then could you also rename the pull request title in the following format?

GH-${GITHUB_ISSUE_ID}: [${COMPONENT}] ${SUMMARY}

or

MINOR: [${COMPONENT}] ${SUMMARY}

In the case of PARQUET issues on JIRA the title also supports:

PARQUET-${JIRA_ISSUE_ID}: [${COMPONENT}] ${SUMMARY}

al13n321 merged commit 5cfccd8 into ClickHouse:release-13.0.0 on May 21, 2024
6 of 33 checks passed
nikitamikhaylov pushed a commit that referenced this pull request Oct 15, 2024
Check validation of of bit offset when reading bit packed values

(cherry picked from commit 5cfccd8)
nikitamikhaylov added a commit that referenced this pull request Oct 16, 2024
* Empty commit

* Merge pull request #66 from copperybean/release-13.0.0

Check validation of of bit offset when reading bit packed values

(cherry picked from commit 5cfccd8)

* Merge pull request #47 from ClickHouse/fix-uninit-value-msan

Fix possible use-of-uninitizliaed-value

(cherry picked from commit ba5c679)

* Merge pull request #39 from ClickHouse/count-from-record-batch

Allow to get number of rows in record batch

(cherry picked from commit 1d93838)

* Merge pull request #9 from taiyang-li/raw_orc_reader

Add interface to get raw orc reader from adapters

(cherry picked from commit ce6b7af)

* Merge pull request #10 from taiyang-li/fix_pr_9

fix building issue introduced by https://github.com/ClickHouse-Extras…

(cherry picked from commit 20dc6ad)

* Merge pull request #14 from ClickHouse/fix-deadlock

Fix deadlock with msan

(cherry picked from commit b41ff44)

* Merge pull request #15 from ClickHouse/try-fix-data-race

Fix 'undefined symbol: pthread_atfork' on PowerPC64

(cherry picked from commit 450a563)

* Merge pull request #16 from ClickHouse/remove-abort-in-logging

Don't abort in ~CerrLog

(cherry picked from commit d03245f)

* Merge pull request #17 from bigo-sg/allow_map_key_optional

Allow Parquet map key to be optional

(cherry picked from commit 0d6d07f)

* Fix build

(cherry picked from commit 3264fda)

---------

Co-authored-by: Michael Kolupaev <[email protected]>
Co-authored-by: Kruglov Pavel <[email protected]>