[GLUTEN-2163][CH] support aggregate function approx_percentile #4829
Thanks for opening a pull request! Could you open an issue for this pull request on Github Issues? https://github.com/oap-project/gluten/issues Then could you also rename commit message and pull request title in the following format?
Run Gluten Clickhouse CI
Notice: this bugfix (ClickHouse/ClickHouse#60740) must be merged first; otherwise we hit the following issue:
0: jdbc:hive2://localhost:10000/>
0: jdbc:hive2://localhost:10000/>
0: jdbc:hive2://localhost:10000/> CREATE TEMPORARY VIEW lineitem
. . . . . . . . . . . . . . . . > USING org.apache.spark.sql.parquet
. . . . . . . . . . . . . . . . > OPTIONS (
. . . . . . . . . . . . . . . . > path "/data1/liyang/cppproject/gluten/gluten-core/src/test/resources/tpch-data/lineitem"
. . . . . . . . . . . . . . . . > ) ;
+---------+
| Result |
+---------+
+---------+
No rows selected (3.193 seconds)
0: jdbc:hive2://localhost:10000/> select l_linenumber % 10, approx_percentile(l_extendedprice, 0.5) from lineitem group by l_linenumber % 10;
Error: org.apache.hive.service.cli.HiveSQLException: Error running query: org.apache.spark.SparkException: Job aborted due to stage failure: Task 11 in stage 2.0 failed 1 times, most recent failure: Lost task 11.0 in stage 2.0 (TID 6) (bigo executor driver): io.glutenproject.exception.GlutenException: io.glutenproject.exception.GlutenException: The number of elements 11869 for quantileGK exceeds 10000
0. ./contrib/llvm-project/libcxx/include/exception:141: Poco::Exception::Exception(String const&, int) @ 0x0000000012067b9d in /data1/liyang/cppproject/kyli/ClickHouse/build_gcc/utils/extern-local-engine/libch.so
1. ./build_gcc/./src/Common/Exception.cpp:96: DB::Exception::Exception(DB::Exception::MessageMasked&&, int, bool) @ 0x000000000b03b0bf in /data1/liyang/cppproject/kyli/ClickHouse/build_gcc/utils/extern-local-engine/libch.so
2. ./contrib/llvm-project/libcxx/include/string:1499: DB::Exception::Exception<unsigned long&, unsigned long&>(int, FormatStringHelperImpl<std::type_identity<unsigned long&>::type, std::type_identity<unsigned long&>::type>, unsigned long&, unsigned long&) @ 0x0000000006ccf93e in /data1/liyang/cppproject/kyli/ClickHouse/build_gcc/utils/extern-local-engine/libch.so
3. ./build_gcc/./src/AggregateFunctions/AggregateFunctionQuantileGK.cpp:0: DB::(anonymous namespace)::QuantileGK<double>::deserialize(DB::ReadBuffer&) @ 0x000000000bfb0fac in /data1/liyang/cppproject/kyli/ClickHouse/build_gcc/utils/extern-local-engine/libch.so
4. ./src/AggregateFunctions/AggregateFunctionQuantile.h:233: DB::AggregateFunctionQuantile<double, DB::(anonymous namespace)::QuantileGK<double>, DB::NameQuantileGK, false, void, false, true>::deserialize(char*, DB::ReadBuffer&, std::optional<unsigned long>, DB::Arena*) const @ 0x000000000bfad96f in /data1/liyang/cppproject/kyli/ClickHouse/build_gcc/utils/extern-local-engine/libch.so
5. ./src/AggregateFunctions/Combinators/AggregateFunctionNull.h:177: DB::AggregateFunctionNullBase<true, true, DB::AggregateFunctionNullUnary<true, true>>::deserialize(char*, DB::ReadBuffer&, std::optional<unsigned long>, DB::Arena*) const @ 0x000000000d29dde6 in /data1/liyang/cppproject/kyli/ClickHouse/build_gcc/utils/extern-local-engine/libch.so
6. ./build_gcc/./src/DataTypes/Serializations/SerializationAggregateFunction.cpp:0: DB::SerializationAggregateFunction::deserializeBinaryBulk(DB::IColumn&, DB::ReadBuffer&, unsigned long, double) const @ 0x000000000e6ce44e in /data1/liyang/cppproject/kyli/ClickHouse/build_gcc/utils/extern-local-engine/libch.so
7. ./contrib/boost/boost/smart_ptr/intrusive_ptr.hpp:211: DB::ISerialization::deserializeBinaryBulkWithMultipleStreams(COW<DB::IColumn>::immutable_ptr<DB::IColumn>&, unsigned long, DB::ISerialization::DeserializeBinaryBulkSettings&, std::shared_ptr<DB::ISerialization::DeserializeBinaryBulkState>&, std::unordered_map<String, COW<DB::IColumn>::immutable_ptr<DB::IColumn>, std::hash<String>, std::equal_to<String>, std::allocator<std::pair<String const, COW<DB::IColumn>::immutable_ptr<DB::IColumn>>>>*) const @ 0x000000000e6c9656 in /data1/liyang/cppproject/kyli/ClickHouse/build_gcc/utils/extern-local-engine/libch.so
8. ./contrib/boost/boost/smart_ptr/intrusive_ptr.hpp:202: local_engine::readNormalComplexData(DB::ReadBuffer&, COW<DB::IColumn>::immutable_ptr<DB::IColumn>&, unsigned long, local_engine::NativeReader::ColumnParseUtil&) @ 0x000000000b37fb57 in /data1/liyang/cppproject/kyli/ClickHouse/build_gcc/utils/extern-local-engine/libch.so
9. ./contrib/llvm-project/libcxx/include/__utility/swap.h:37: local_engine::NativeReader::prepareByFirstBlock() @ 0x000000000b37e6ca in /data1/liyang/cppproject/kyli/ClickHouse/build_gcc/utils/extern-local-engine/libch.so
10. ./build_gcc/./utils/extern-local-engine/Storages/IO/NativeReader.cpp:0: local_engine::NativeReader::read() @ 0x000000000b37d115 in /data1/liyang/cppproject/kyli/ClickHouse/build_gcc/utils/extern-local-engine/libch.so
11. ./build_gcc/./utils/extern-local-engine/Shuffle/ShuffleReader.cpp:52: local_engine::ShuffleReader::read() @ 0x000000000b4369a7 in /data1/liyang/cppproject/kyli/ClickHouse/build_gcc/utils/extern-local-engine/libch.so
12. ./build_gcc/./utils/extern-local-engine/local_engine_jni.cpp:587: Java_io_glutenproject_vectorized_CHStreamReader_nativeNext @ 0x0000000005eed2bb in /data1/liyang/cppproject/kyli/ClickHouse/build_gcc/utils/extern-local-engine/libch.so
0. ./contrib/llvm-project/libcxx/include/exception:141: Poco::Exception::Exception(String const&, int) @ 0x0000000012067b9d in /data1/liyang/cppproject/kyli/ClickHouse/build_gcc/utils/extern-local-engine/libch.so
1. ./build_gcc/./src/Common/Exception.cpp:96: DB::Exception::Exception(DB::Exception::MessageMasked&&, int, bool) @ 0x000000000b03b0bf in /data1/liyang/cppproject/kyli/ClickHouse/build_gcc/utils/extern-local-engine/libch.so
2. ./contrib/llvm-project/libcxx/include/string:1499: DB::Exception::createRuntime(int, String&) @ 0x0000000005efccd2 in /data1/liyang/cppproject/kyli/ClickHouse/build_gcc/utils/extern-local-engine/libch.so
3. ./utils/extern-local-engine/jni/jni_common.h:79: unsigned char local_engine::safeCallBooleanMethod<>(JNIEnv_*, _jobject*, _jmethodID*) @ 0x0000000005efdd70 in /data1/liyang/cppproject/kyli/ClickHouse/build_gcc/utils/extern-local-engine/libch.so
4. ./build_gcc/./utils/extern-local-engine/Storages/SourceFromJavaIter.cpp:55: local_engine::SourceFromJavaIter::peekBlock(JNIEnv_*, _jobject*) @ 0x000000000b36a3dd in /data1/liyang/cppproject/kyli/ClickHouse/build_gcc/utils/extern-local-engine/libch.so
5. ./build_gcc/./utils/extern-local-engine/Parser/SerializedPlanParser.cpp:318: local_engine::SerializedPlanParser::parseReadRealWithJavaIter(substrait::ReadRel const&) @ 0x000000000b31578a in /data1/liyang/cppproject/kyli/ClickHouse/build_gcc/utils/extern-local-engine/libch.so
6. ./build_gcc/./utils/extern-local-engine/Parser/SerializedPlanParser.cpp:0: local_engine::SerializedPlanParser::parseOp(substrait::Rel const&, std::list<substrait::Rel const*, std::allocator<substrait::Rel const*>>&) @ 0x000000000b31b3f9 in /data1/liyang/cppproject/kyli/ClickHouse/build_gcc/utils/extern-local-engine/libch.so
7. ./build_gcc/./utils/extern-local-engine/Parser/RelParser.cpp:70: local_engine::RelParser::parseOp(substrait::Rel const&, std::list<substrait::Rel const*, std::allocator<substrait::Rel const*>>&) @ 0x000000000b2d3a39 in /data1/liyang/cppproject/kyli/ClickHouse/build_gcc/utils/extern-local-engine/libch.so
8. ./contrib/llvm-project/libcxx/include/__memory/unique_ptr.h:303: local_engine::SerializedPlanParser::parseOp(substrait::Rel const&, std::list<substrait::Rel const*, std::allocator<substrait::Rel const*>>&) @ 0x000000000b31a71d in /data1/liyang/cppproject/kyli/ClickHouse/build_gcc/utils/extern-local-engine/libch.so
9. ./build_gcc/./utils/extern-local-engine/Parser/SerializedPlanParser.cpp:398: local_engine::SerializedPlanParser::parse(std::unique_ptr<substrait::Plan, std::default_delete<substrait::Plan>>) @ 0x000000000b3193e7 in /data1/liyang/cppproject/kyli/ClickHouse/build_gcc/utils/extern-local-engine/libch.so
10. ./build_gcc/./utils/extern-local-engine/Parser/SerializedPlanParser.cpp:1790: local_engine::SerializedPlanParser::parse(String const&) @ 0x000000000b327dc5 in /data1/liyang/cppproject/kyli/ClickHouse/build_gcc/utils/extern-local-engine/libch.so
11. ./build_gcc/./utils/extern-local-engine/local_engine_jni.cpp:277: Java_io_glutenproject_vectorized_ExpressionEvaluatorJniWrapper_nativeCreateKernelWithIterator @ 0x0000000005ee75a0 in /data1/liyang/cppproject/kyli/ClickHouse/build_gcc/utils/extern-local-engine/libch.so
at io.glutenproject.vectorized.ExpressionEvaluatorJniWrapper.nativeCreateKernelWithIterator(Native Method)
at io.glutenproject.vectorized.CHNativeExpressionEvaluator.createKernelWithBatchIterator(CHNativeExpressionEvaluator.java:93)
at io.glutenproject.backendsapi.clickhouse.CHIteratorApi.genFinalStageIterator(CHIteratorApi.scala:265)
at io.glutenproject.execution.WholeStageZippedPartitionsRDD.$anonfun$compute$1(WholeStageZippedPartitionsRDD.scala:58)
at io.glutenproject.utils.Arm$.withResource(Arm.scala:25)
at io.glutenproject.metrics.GlutenTimeMetric$.millis(GlutenTimeMetric.scala:37)
at io.glutenproject.execution.WholeStageZippedPartitionsRDD.compute(WholeStageZippedPartitionsRDD.scala:46)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:365)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:329)
at org.apache.spark.sql.execution.CHColumnarToRowRDD.compute(CHColumnarToRowExec.scala:92)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:365)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:329)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:365)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:329)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
at org.apache.spark.scheduler.Task.run(Task.scala:136)
at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:548)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1504)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:551)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
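The failure above is raised by a size check while the quantileGK state is deserialized on the shuffle-read side (frame 3, `QuantileGK<double>::deserialize`). A minimal Python sketch of that kind of guard follows. The wire format (a little-endian 64-bit count followed by doubles), the `MAX_ELEMENTS` constant, and all function names are illustrative assumptions, not ClickHouse's actual layout or code; only the limit of 10000 and the count 11869 are taken from the error message.

```python
import struct

# Assumed cap, taken from the error message above ("exceeds 10000").
MAX_ELEMENTS = 10000


def serialize_state(values):
    """Hypothetical wire format: uint64 count, then that many float64 values."""
    return struct.pack("<Q", len(values)) + struct.pack(f"<{len(values)}d", *values)


def deserialize_state(buf, max_elements=MAX_ELEMENTS):
    """Read a state back, rejecting counts above the cap before reading the payload."""
    (count,) = struct.unpack_from("<Q", buf, 0)
    if count > max_elements:
        raise ValueError(
            f"The number of elements {count} for quantileGK exceeds {max_elements}"
        )
    return list(struct.unpack_from(f"<{count}d", buf, 8))


def try_deserialize(n):
    """Round-trip a state with n elements; return the error text, or None on success."""
    try:
        deserialize_state(serialize_state([float(i) for i in range(n)]))
        return None
    except ValueError as exc:
        return str(exc)
```

In this sketch the writer accepts any size, so a state that is larger than the reader's cap fails only when it is read back. That is consistent with the trace, where the exception surfaces under `ShuffleReader::read` and `NativeReader::read` rather than in the aggregation itself.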
A Velox UT failed. Do you know why, @rui-mo? It is strange because
@zzcclp @liuneng1994 could you help review this PR? Thanks very much!
@taiyang-li I assume it passes the validation here https://github.com/apache/incubator-gluten/blob/main/backends-velox/src/main/scala/io/glutenproject/backendsapi/velox/VeloxBackend.scala#L326-L327 because
Velox build failed. cc @rui-mo
Could you rebase this PR? Thanks.
LGTM. @rui-mo, could you help review this PR? Thanks.
What changes were proposed in this pull request?
(Fixes: #2163)
Vanilla Spark takes 2.665 seconds.
Gluten takes 1.502 seconds.
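To put the two timings above in relative terms, the short calculation below derives the speedup and the fraction of time saved. The variable names are illustrative, and the figures are only the single pair of measurements reported here, so the result should be read as a rough indication rather than a benchmark.

```python
# Timings reported in the PR description, in seconds.
vanilla_seconds = 2.665
gluten_seconds = 1.502

# Speedup: how many times faster the Gluten run is than the vanilla run.
speedup = vanilla_seconds / gluten_seconds

# Fraction of the vanilla wall-clock time that the Gluten run saves.
time_saved_fraction = (vanilla_seconds - gluten_seconds) / vanilla_seconds

print(f"speedup: {speedup:.2f}x")
print(f"time saved: {time_saved_fraction:.1%}")
```

This works out to a speedup of about 1.77x, or roughly 43.6% less wall-clock time for the measured run.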