Master autorun workflow #44

Open
wants to merge 7,753 commits into base: master

Conversation

CalvinKirs
Owner

Proposed changes

Issue Number: close #xxx

Further comments

If this is a relatively large or complex change, kick off the discussion at [email protected] by explaining why you chose the solution you did and what alternatives you considered, etc...

Gabriel39 and others added 30 commits October 21, 2024 14:07
…hed jobs (apache#42150)

## Proposed changes

Fix a case failure like the one below:
```
Exception in inverted_index_p0/test_build_index.groovy(line 185):

    wait_for_latest_op_on_table_finish(tableName, timeout)

    // BUILD INDEX and expect state is RUNNING
    sql """ BUILD INDEX idx_comment ON ${tableName} """
    def state = wait_for_last_build_index_on_table_running(tableName, timeout)
    if (state != "SKIPPED") {
        def result = sql """ SHOW BUILD INDEX WHERE TableName = "${tableName}" ORDER BY JobId """
        assertEquals(result[result.size()-1][1], tableName)
        assertTrue(result[result.size()-1][3].contains("ADD INDEX"))
        assertEquals(result[result.size()-1][7], "RUNNING")
^^^^^^^^^^^^^^^^^^^^^^^^^^ERROR LINE^^^^^^^^^^^^^^^^^^^^^^^^^^

        // CANCEL BUILD INDEX and expect state is CANCELED
        sql """ CANCEL BUILD INDEX ON ${tableName} (${result[result.size()-1][0]}) """
        result = sql """ SHOW BUILD INDEX WHERE TableName = "${tableName}" ORDER BY JobId """
        assertEquals(result[result.size()-1][1], tableName)
        assertTrue(result[result.size()-1][3].contains("ADD INDEX"))
        assertEquals(result[result.size()-1][7], "CANCELLED")
        assertEquals(result[result.size()-1][8], "user cancelled")

        // BUILD INDEX and expect state is FINISHED

Exception:
org.opentest4j.AssertionFailedError: expected: <FINISHED> but was: <RUNNING>
```
)

filter_generic in column_array incorrectly calculates
nested_result_hint_size, which may cause a heap_buffer_overflow.
… call (apache#41739)

## Proposed changes

Issue Number: close #xxx

<!--Describe your changes.-->

---------

Co-authored-by: zhiqiang-hhhh <[email protected]>
…_in (apache#41681)

trim_in is different from trim: it finds and removes any characters from a
given set of characters at both ends of a string (regardless of order).

mysql> SELECT TRIM('abcd', 'cde');
+---------------------+
| trim('abcd', 'cde') |
+---------------------+
| abcd                |
+---------------------+
1 row in set (0.02 sec)

mysql> SELECT TRIM_IN('abcd', 'cde');
+------------------------+
| trim_in('abcd', 'cde') |
+------------------------+
| ab                     |
+------------------------+
1 row in set (0.02 sec)
### What problem does this PR solve?

Problem Summary:
1. `test_jdbc_catalog_ddl` used the same catalog name as
`test_mysql_jdbc5_catalog`, which caused a concurrency issue.
2. Comment out the `paimon_base_filesystem` test because the COS environment is not
ready.

- Release note

    None
Because the `test_trino_hive_tpch_sf1_parquet` case may cause a BE JVM OOM, we
increase the BE's JVM memory.
…les` (apache#42188)

We should add `order by col1` because there are multiple repeated col0 rows.
…te express is not slot (apache#42113)

## Proposed changes
Follow-up to apache#35335.
When a `"case when ... then ... when ... then ... else"` expression occurs,
function_expr may not exist in the pushed-down predicate, but the
handling of null values is still problematic.

table data:
```text
mysql> select o_orderpriority from test_string_dict_filter_orc;
+-----------------+
| o_orderpriority |
+-----------------+
| 5-LOW           |
| 1-URGENT        |
| 5-LOW           |
| NULL            |
| 5-LOW           |
+-----------------+
```

before:
```text
mysql> select count(o_orderpriority) from ( select (case when o_orderpriority = 'x' then '1' when o_orderpriority = 'y' then '2' else '0' end) as o_orderpriority from test_string_dict_filter_orc ) as A where o_orderpriority = '0';
+------------------------+
| count(o_orderpriority) |
+------------------------+
|                      4 |
+------------------------+
```

after:
```text
mysql> select count(o_orderpriority) from ( select (case when o_orderpriority = 'x' then '1' when o_orderpriority = 'y' then '2' else '0' end) as o_orderpriority from test_string_dict_filter_orc ) as A where o_orderpriority = '0';
+------------------------+
| count(o_orderpriority) |
+------------------------+
|                      5 |
+------------------------+
```
…es (apache#38227)

# Catalog DDL
```
Doris > CREATE CATALOG `oracle` PROPERTIES (
    -> "user" = "doris_test",
    -> "type" = "jdbc",
    -> "password" = "xxx",
    -> "jdbc_url" = "jdbc:oracle:thin:@xxx:1521:XE",
    -> "driver_url" = "ojdbc8-19.3.0.0.jar",
    -> "driver_class" = "oracle.jdbc.driver.OracleDriver"
    -> ); 
Query OK, 0 rows affected (2.16 sec)
```

# FE Conf lower_case_table_names = 1

```
Doris > show variables like '%lower_case%';
+------------------------+-------+---------------+---------+
| Variable_name          | Value | Default_Value | Changed |
+------------------------+-------+---------------+---------+
| lower_case_table_names | 1     | 0             | 1       |
+------------------------+-------+---------------+---------+
1 row in set (0.00 sec)

Doris > show tables from oracle.DORIS_TEST;
+----------------------+
| Tables_in_DORIS_TEST |
+----------------------+
| aa/d                 |
| aaad                 |
| lower_test           |
| student              |
| student2             |
| student3             |
| test_all_types       |
| test_char            |
| test_clob            |
| test_date            |
| test_insert          |
| test_int             |
| test_num             |
| test_number          |
| test_number2         |
| test_number3         |
| test_number4         |
| test_raw             |
| test_timestamp       |
+----------------------+
19 rows in set (0.01 sec)

```

## Before:

```
Doris > select * from oracle.DORIS_TEST.test_int limit 1;
+------+----------------+-----------------+------------+--------------------+----------------+-----------------+------------+---------------------+
| ID   | TINYINT_VALUE1 | SMALLINT_VALUE1 | INT_VALUE1 | BIGINT_VALUE1      | TINYINT_VALUE2 | SMALLINT_VALUE2 | INT_VALUE2 | BIGINT_VALUE2       |
+------+----------------+-----------------+------------+--------------------+----------------+-----------------+------------+---------------------+
| 1    |             99 |            9999 |  999999999 | 999999999999999999 |            999 |           99999 | 9999999999 | 9999999999999999999 |
+------+----------------+-----------------+------------+--------------------+----------------+-----------------+------------+---------------------+
1 row in set (1.03 sec)

Doris > select * from oracle.DORIS_TEST.TEST_INT limit 1;
ERROR 1105 (HY000): errCode = 2, detailMessage = Table [TEST_INT] does not exist in database [DORIS_TEST].

```
## After:

```
Doris > select * from oracle.DORIS_TEST.test_int limit 1;
+------+----------------+-----------------+------------+--------------------+----------------+-----------------+------------+---------------------+
| ID   | TINYINT_VALUE1 | SMALLINT_VALUE1 | INT_VALUE1 | BIGINT_VALUE1      | TINYINT_VALUE2 | SMALLINT_VALUE2 | INT_VALUE2 | BIGINT_VALUE2       |
+------+----------------+-----------------+------------+--------------------+----------------+-----------------+------------+---------------------+
| 1    |             99 |            9999 |  999999999 | 999999999999999999 |            999 |           99999 | 9999999999 | 9999999999999999999 |
+------+----------------+-----------------+------------+--------------------+----------------+-----------------+------------+---------------------+
1 row in set (0.20 sec)

Doris > select * from oracle.DORIS_TEST.TEST_INT limit 1;
+------+----------------+-----------------+------------+--------------------+----------------+-----------------+------------+---------------------+
| ID   | TINYINT_VALUE1 | SMALLINT_VALUE1 | INT_VALUE1 | BIGINT_VALUE1      | TINYINT_VALUE2 | SMALLINT_VALUE2 | INT_VALUE2 | BIGINT_VALUE2       |
+------+----------------+-----------------+------------+--------------------+----------------+-----------------+------------+---------------------+
| 1    |             99 |            9999 |  999999999 | 999999999999999999 |            999 |           99999 | 9999999999 | 9999999999999999999 |
+------+----------------+-----------------+------------+--------------------+----------------+-----------------+------------+---------------------+
1 row in set (0.20 sec)
```


# FE Conf lower_case_table_names = 2

```
Doris > show variables like '%lower_case%';
+------------------------+-------+---------------+---------+
| Variable_name          | Value | Default_Value | Changed |
+------------------------+-------+---------------+---------+
| lower_case_table_names | 2     | 0             | 1       |
+------------------------+-------+---------------+---------+
1 row in set (0.01 sec)

Doris > show tables from oracle.DORIS_TEST;
+----------------------+
| Tables_in_DORIS_TEST |
+----------------------+
| AA/D                 |
| AAAD                 |
| LOWER_TEST           |
| STUDENT              |
| TEST_ALL_TYPES       |
| TEST_CHAR            |
| TEST_CLOB            |
| TEST_DATE            |
| TEST_INSERT          |
| TEST_INT             |
| TEST_NUM             |
| TEST_NUMBER          |
| TEST_NUMBER2         |
| TEST_NUMBER3         |
| TEST_NUMBER4         |
| TEST_RAW             |
| TEST_TIMESTAMP       |
| student2             |
| student3             |
+----------------------+
19 rows in set (1.05 sec)
```

## Before:

```
Doris > select * from oracle.DORIS_TEST.test_int limit 1;
ERROR 1105 (HY000): errCode = 2, detailMessage = Table [test_int] does not exist in database [DORIS_TEST].
Doris > select * from oracle.DORIS_TEST.TEST_INT limit 1;
+------+----------------+-----------------+------------+--------------------+----------------+-----------------+------------+---------------------+
| ID   | TINYINT_VALUE1 | SMALLINT_VALUE1 | INT_VALUE1 | BIGINT_VALUE1      | TINYINT_VALUE2 | SMALLINT_VALUE2 | INT_VALUE2 | BIGINT_VALUE2       |
+------+----------------+-----------------+------------+--------------------+----------------+-----------------+------------+---------------------+
| 1    |             99 |            9999 |  999999999 | 999999999999999999 |            999 |           99999 | 9999999999 | 9999999999999999999 |
+------+----------------+-----------------+------------+--------------------+----------------+-----------------+------------+---------------------+
1 row in set (1.07 sec)
```

## After:

```
Doris > select * from oracle.DORIS_TEST.test_int limit 1;
+------+----------------+-----------------+------------+--------------------+----------------+-----------------+------------+---------------------+
| ID   | TINYINT_VALUE1 | SMALLINT_VALUE1 | INT_VALUE1 | BIGINT_VALUE1      | TINYINT_VALUE2 | SMALLINT_VALUE2 | INT_VALUE2 | BIGINT_VALUE2       |
+------+----------------+-----------------+------------+--------------------+----------------+-----------------+------------+---------------------+
| 1    |             99 |            9999 |  999999999 | 999999999999999999 |            999 |           99999 | 9999999999 | 9999999999999999999 |
+------+----------------+-----------------+------------+--------------------+----------------+-----------------+------------+---------------------+
1 row in set (0.21 sec)

Doris > select * from oracle.DORIS_TEST.TEST_INT limit 1;
+------+----------------+-----------------+------------+--------------------+----------------+-----------------+------------+---------------------+
| ID   | TINYINT_VALUE1 | SMALLINT_VALUE1 | INT_VALUE1 | BIGINT_VALUE1      | TINYINT_VALUE2 | SMALLINT_VALUE2 | INT_VALUE2 | BIGINT_VALUE2       |
+------+----------------+-----------------+------------+--------------------+----------------+-----------------+------------+---------------------+
| 1    |             99 |            9999 |  999999999 | 999999999999999999 |            999 |           99999 | 9999999999 | 9999999999999999999 |
+------+----------------+-----------------+------------+--------------------+----------------+-----------------+------------+---------------------+
1 row in set (0.20 sec)
```
…ile` session var to skip checking acid version file in some hive envs. (apache#42111)

## Proposed changes

[Configuration] (transactional-hive) Add
`skip_checking_acid_version_file` session var to skip checking acid
version file in some hive envs.
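A hedged usage sketch: assuming the new session variable is a boolean that defaults to false, it can be enabled per session before scanning a transactional Hive table (the catalog and table names below are hypothetical).

```
-- Assumption: boolean session variable, off by default.
SET skip_checking_acid_version_file = true;

-- Hypothetical transactional Hive table; the query itself is unchanged,
-- only the ACID version-file check is skipped during the scan.
SELECT count(*) FROM hive_catalog.tpch.orders_acid;
```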
…che#42189)

## Proposed changes

We have already turned on this setting for the P0 regression cases, and
after observing for some time, I believe we can enable this switch by default.
…rgument. (apache#41008)

## Proposed changes

```
mysql> select cast(4562632 as time),hour(cast(4562632 as time)), minute(cast(4562632 as time)),second(cast(4562632 as time));
+-----------------------+-----------------------------+-------------------------------+-------------------------------+
| cast(4562632 as TIME) | hour(cast(4562632 as TIME)) | minute(cast(4562632 as TIME)) | second(cast(4562632 as TIME)) |
+-----------------------+-----------------------------+-------------------------------+-------------------------------+
| 456:26:32             |                         456 |                            26 |                            32 |
+-----------------------+-----------------------------+-------------------------------+-------------------------------+
```

<!--Describe your changes.-->

---------

Co-authored-by: Dongyang Li <[email protected]>
…pache#41772)

## Proposed changes

This PR addresses an issue where counting on an index with a projected
expression was suboptimal.
…pache#42215)

Earlier, we turned on the -Wno-missing-field-initializers and
-Wno-unused-const-variable switches; now we choose to disable them.

same as apache#37403
…pache#42064)

## Proposed changes

In the current backend implementation, counts are not handled correctly
when an `is null` predicate exists, so for now we forbid the storage layer's
count optimization.

for example:
```
select count(b) from test where b is null 
```

When the is null filter is applied, the result array is filled with the default
value of the column type, and the count operator cannot detect whether
the result is null, so it computes a wrong result.
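A small illustration of the expected semantics (the table below is hypothetical, not from this PR): `count(b)` only counts non-NULL values, so under a `b is null` filter the correct answer is always 0; if the storage layer substitutes column defaults for NULLs, a pushed-down count would instead return the number of matched rows.

```
-- Hypothetical table, only to illustrate the semantics described above.
CREATE TABLE test (a INT, b INT NULL)
DUPLICATE KEY(a) DISTRIBUTED BY HASH(a) BUCKETS 1
PROPERTIES ("replication_num" = "1");

INSERT INTO test VALUES (1, 10), (2, NULL), (3, NULL);

-- Correct result is 0: count(b) ignores NULLs and every surviving row has b IS NULL.
-- A storage-layer count that fills defaults for NULLs would wrongly return 2.
SELECT count(b) FROM test WHERE b IS NULL;
```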
reset_range enlarges the size if the download size is larger than the
original size. The TTL LRU queue missed that update while still subtracting the
larger size when entries were removed from the queue, causing the size
metric to underflow below zero.
## Proposed changes

I cherry-picked the regression test cases for issues fixed in branch-2.0
and branch-2.1 to ensure the correctness of the master branch.
apache#41200 apache#41297 apache#40425 apache#40630
…dir when index file writer open index (apache#42207)

## Proposed changes

Fix UAF as below
```
==16442==ERROR: AddressSanitizer: heap-use-after-free on address 0x60f0008718a0 at pc 0x5586663cea7a bp 0x7f02673d0530 sp 0x7f02673d0528
READ of size 8 at 0x60f0008718a0 thread T490 (CumuCompactionT)
    #0 0x5586663cea79 in doris::segment_v2::InvertedIndexColumnWriterImpl<(doris::FieldType)5>::finish() /home/zcp/repo_center/doris_master/doris/be/src/olap/rowset/segment_v2/inverted_index_writer.cpp:549:63
    #1 0x5586663360bc in doris::segment_v2::ScalarColumnWriter::write_inverted_index() /home/zcp/repo_center/doris_master/doris/be/src/olap/rowset/segment_v2/column_writer.cpp:640:41
    #2 0x5586662ce160 in doris::segment_v2::SegmentWriter::_write_inverted_index() /home/zcp/repo_center/doris_master/doris/be/src/olap/rowset/segment_v2/segment_writer.cpp:1435:9
    #3 0x5586662ccb13 in doris::segment_v2::SegmentWriter::finalize_columns_index(unsigned long*) /home/zcp/repo_center/doris_master/doris/be/src/olap/rowset/segment_v2/segment_writer.cpp:1317:5
    #4 0x55866653c074 in doris::VerticalBetaRowsetWriter::_flush_columns(doris::segment_v2::SegmentWriter*, bool) /home/zcp/repo_center/doris_master/doris/be/src/olap/rowset/vertical_beta_rowset_writer.cpp:127:5
    #5 0x558666534e3f in doris::VerticalBetaRowsetWriter::flush_columns(bool) /home/zcp/repo_center/doris_master/doris/be/src/olap/rowset/vertical_beta_rowset_writer.cpp:153:5
    #6 0x558665a8d8b2 in doris::Merger::vertical_compact_one_group(std::shared_ptr, doris::ReaderType, doris::TabletSchema const&, bool, std::vector> const&, doris::vectorized::RowSourcesBuffer*, std::vector, std::allocator>> const&, doris::RowsetWriter*, long, doris::Merger::Statistics*, std::vector>, long, doris::CompactionSampleInfo*) /home/zcp/repo_center/doris_master/doris/be/src/olap/merger.cpp:333:5
    #7 0x558665a93190 in doris::Merger::vertical_merge_rowsets(std::shared_ptr, doris::ReaderType, doris::TabletSchema const&, std::vector, std::allocator>> const&, doris::RowsetWriter*, long, long, doris::Merger::Statistics*) /home/zcp/repo_center/doris_master/doris/be/src/olap/merger.cpp:471:21
    #8 0x558665a00fd7 in doris::Compaction::merge_input_rowsets() /home/zcp/repo_center/doris_master/doris/be/src/olap/compaction.cpp:192:19
    #9 0x558665a329a3 in doris::CloudCompactionMixin::execute_compact_impl(long) /home/zcp/repo_center/doris_master/doris/be/src/olap/compaction.cpp:1158:5
    #10 0x558665a33381 in doris::CloudCompactionMixin::execute_compact() /home/zcp/repo_center/doris_master/doris/be/src/olap/compaction.cpp:1173:5
    #11 0x55869a893d89 in doris::CloudCumulativeCompaction::execute_compact() /home/zcp/repo_center/doris_master/doris/be/src/cloud/cloud_cumulative_compaction.cpp:191:38
    #12 0x55869a861fac in doris::CloudStorageEngine::_submit_cumulative_compaction_task(std::shared_ptr const&)::$_1::operator()() const /home/zcp/repo_center/doris_master/doris/be/src/cloud/cloud_storage_engine.cpp:693:31
    #13 0x55869a861fac in void std::__invoke_impl const&)::$_1&>(std::__invoke_other, doris::CloudStorageEngine::_submit_cumulative_compaction_task(std::shared_ptr const&)::$_1&) /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/invoke.h:61:14
    #14 0x55869a861fac in std::enable_if const&)::$_1&>, void>::type std::__invoke_r const&)::$_1&>(doris::CloudStorageEngine::_submit_cumulative_compaction_task(std::shared_ptr const&)::$_1&) /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/invoke.h:111:2
    #15 0x55869a861fac in std::_Function_handler const&)::$_1>::_M_invoke(std::_Any_data const&) /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/std_function.h:291:9
    #16 0x5586679033fb in doris::ThreadPool::dispatch_thread() /home/zcp/repo_center/doris_master/doris/be/src/util/threadpool.cpp:543:24
    #17 0x5586678db057 in doris::Thread::supervise_thread(void*) /home/zcp/repo_center/doris_master/doris/be/src/util/thread.cpp:498:5
    #18 0x7f0639d7bac2 in start_thread nptl/pthread_create.c:442:8
    #19 0x7f0639e0d84f  misc/../sysdeps/unix/sysv/linux/x86_64/clone3.S:81

0x60f0008718a0 is located 0 bytes inside of 168-byte region [0x60f0008718a0,0x60f000871948)
freed by thread T490 (CumuCompactionT) here:
    #0 0x558663259d9d in operator delete(void*) (/mnt/hdd01/ci/master-deploy/cluster0/be/lib/doris_be+0x337a1d9d) (BuildId: fa8094411569cb8d)
    #1 0x558665a655eb in std::default_delete::operator()(lucene::store::Directory*) const /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/unique_ptr.h:85:2
    #2 0x558665a655eb in std::unique_ptr>::~unique_ptr() /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/unique_ptr.h:361:4
    #3 0x558665f83dfe in std::pair, std::allocator>> const, std::unique_ptr>>::~pair() /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/stl_iterator.h:2379:12
    #4 0x558665f83dfe in void std::destroy_at, std::allocator>> const, std::unique_ptr>>>(std::pair, std::allocator>> const, std::unique_ptr>>*) /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/stl_construct.h:88:15
    #5 0x558665f83dfe in void std::allocator_traits, std::allocator>> const, std::unique_ptr>>>>>::destroy, std::allocator>> const, std::unique_ptr>>>(std::allocator, std::allocator>> const, std::unique_ptr>>>>&, std::pair, std::allocator>> const, std::unique_ptr>>*) /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/alloc_traits.h:533:4
    #6 0x558665f83dfe in std::_Rb_tree, std::allocator>>, std::pair, std::allocator>> const, std::unique_ptr>>, std::_Select1st, std::allocator>> const, std::unique_ptr>>>, std::less, std::allocator>>>, std::allocator, std::allocator>> const, std::unique_ptr>>>>::_M_destroy_node(std::_Rb_tree_node, std::allocator>> const, std::unique_ptr>>>*) /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/stl_tree.h:623:2
    #7 0x558665f83dfe in std::_Rb_tree, std::allocator>>, std::pair, std::allocator>> const, std::unique_ptr>>, std::_Select1st, std::allocator>> const, std::unique_ptr>>>, std::less, std::allocator>>>, std::allocator, std::allocator>> const, std::unique_ptr>>>>::_M_drop_node(std::_Rb_tree_node, std::allocator>> const, std::unique_ptr>>>*) /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/stl_tree.h:631:2
    #8 0x558665f9e9b8 in std::pair, std::allocator>> const, std::unique_ptr>>>, bool> std::_Rb_tree, std::allocator>>, std::pair, std::allocator>> const, std::unique_ptr>>, std::_Select1st, std::allocator>> const, std::unique_ptr>>>, std::less, std::allocator>>>, std::allocator, std::allocator>> const, std::unique_ptr>>>>::_M_emplace_unique, std::allocator>>, std::unique_ptr>>(std::pair, std::allocator>>&&, std::unique_ptr>&&) /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/stl_tree.h:2391:6
    #9 0x558665f86e29 in std::pair, std::allocator>> const, std::unique_ptr>>>, bool> std::map, std::allocator>>, std::unique_ptr>, std::less, std::allocator>>>, std::allocator, std::allocator>> const, std::unique_ptr>>>>::emplace, std::allocator>>, std::unique_ptr>>(std::pair, std::allocator>>&&, std::unique_ptr>&&) /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/stl_map.h:577:16
    #10 0x558665f86e29 in doris::segment_v2::InvertedIndexFileWriter::open(doris::TabletIndex const*) /home/zcp/repo_center/doris_master/doris/be/src/olap/rowset/segment_v2/inverted_index_file_writer.cpp:60:19
    #11 0x5586663d0423 in doris::segment_v2::InvertedIndexColumnWriterImpl<(doris::FieldType)5>::open_index_directory() /home/zcp/repo_center/doris_master/doris/be/src/olap/rowset/segment_v2/inverted_index_writer.cpp:160:16
    #12 0x5586663d00f7 in doris::segment_v2::InvertedIndexColumnWriterImpl<(doris::FieldType)5>::init_bkd_index() /home/zcp/repo_center/doris_master/doris/be/src/olap/rowset/segment_v2/inverted_index_writer.cpp:146:16
    #13 0x5586663cb2d9 in doris::segment_v2::InvertedIndexColumnWriterImpl<(doris::FieldType)5>::init() /home/zcp/repo_center/doris_master/doris/be/src/olap/rowset/segment_v2/inverted_index_writer.cpp:107:24
    #14 0x5586663b1201 in doris::segment_v2::InvertedIndexColumnWriter::create(doris::Field const*, std::unique_ptr>*, doris::segment_v2::InvertedIndexFileWriter*, doris::TabletIndex const*) /home/zcp/repo_center/doris_master/doris/be/src/olap/rowset/segment_v2/inverted_index_writer.cpp:697:27
    #15 0x55866632cf8d in doris::segment_v2::ScalarColumnWriter::init() /home/zcp/repo_center/doris_master/doris/be/src/olap/rowset/segment_v2/column_writer.cpp:483:13
    #16 0x5586662a5e52 in doris::segment_v2::SegmentWriter::_create_column_writer(unsigned int, doris::TabletColumn const&, std::shared_ptr const&) /home/zcp/repo_center/doris_master/doris/be/src/olap/rowset/segment_v2/segment_writer.cpp:267:5
    #17 0x5586662a71b4 in doris::segment_v2::SegmentWriter::_create_writers(std::shared_ptr const&, std::vector> const&) /home/zcp/repo_center/doris_master/doris/be/src/olap/rowset/segment_v2/segment_writer.cpp:316:9
    #18 0x5586662a2be3 in doris::segment_v2::SegmentWriter::init(std::vector> const&, bool) /home/zcp/repo_center/doris_master/doris/be/src/olap/rowset/segment_v2/segment_writer.cpp:285:5
    #19 0x5586665333b2 in doris::VerticalBetaRowsetWriter::add_columns(doris::vectorized::Block const*, std::vector> const&, bool, unsigned int) /home/zcp/repo_center/doris_master/doris/be/src/olap/rowset/vertical_beta_rowset_writer.cpp:96:17
    #20 0x558665a8cb9b in doris::Merger::vertical_compact_one_group(std::shared_ptr, doris::ReaderType, doris::TabletSchema const&, bool, std::vector> const&, doris::vectorized::RowSourcesBuffer*, std::vector, std::allocator>> const&, doris::RowsetWriter*, long, doris::Merger::Statistics*, std::vector>, long, doris::CompactionSampleInfo*) /home/zcp/repo_center/doris_master/doris/be/src/olap/merger.cpp:309:9
    #21 0x558665a93190 in doris::Merger::vertical_merge_rowsets(std::shared_ptr, doris::ReaderType, doris::TabletSchema const&, std::vector, std::allocator>> const&, doris::RowsetWriter*, long, long, doris::Merger::Statistics*) /home/zcp/repo_center/doris_master/doris/be/src/olap/merger.cpp:471:21
    #22 0x558665a00fd7 in doris::Compaction::merge_input_rowsets() /home/zcp/repo_center/doris_master/doris/be/src/olap/compaction.cpp:192:19
    #23 0x558665a329a3 in doris::CloudCompactionMixin::execute_compact_impl(long) /home/zcp/repo_center/doris_master/doris/be/src/olap/compaction.cpp:1158:5
    #24 0x558665a33381 in doris::CloudCompactionMixin::execute_compact() /home/zcp/repo_center/doris_master/doris/be/src/olap/compaction.cpp:1173:5
    #25 0x55869a893d89 in doris::CloudCumulativeCompaction::execute_compact() /home/zcp/repo_center/doris_master/doris/be/src/cloud/cloud_cumulative_compaction.cpp:191:38
    #26 0x55869a861fac in doris::CloudStorageEngine::_submit_cumulative_compaction_task(std::shared_ptr const&)::$_1::operator()() const /home/zcp/repo_center/doris_master/doris/be/src/cloud/cloud_storage_engine.cpp:693:31
    #27 0x55869a861fac in void std::__invoke_impl const&)::$_1&>(std::__invoke_other, doris::CloudStorageEngine::_submit_cumulative_compaction_task(std::shared_ptr const&)::$_1&) /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/invoke.h:61:14
    #28 0x55869a861fac in std::enable_if const&)::$_1&>, void>::type std::__invoke_r const&)::$_1&>(doris::CloudStorageEngine::_submit_cumulative_compaction_task(std::shared_ptr const&)::$_1&) /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/invoke.h:111:2
    #29 0x55869a861fac in std::_Function_handler const&)::$_1>::_M_invoke(std::_Any_data const&) /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/std_function.h:291:9
    #30 0x5586679033fb in doris::ThreadPool::dispatch_thread() /home/zcp/repo_center/doris_master/doris/be/src/util/threadpool.cpp:543:24
    #31 0x5586678db057 in doris::Thread::supervise_thread(void*) /home/zcp/repo_center/doris_master/doris/be/src/util/thread.cpp:498:5
    #32 0x7f0639d7bac2 in start_thread nptl/pthread_create.c:442:8

previously allocated by thread T490 (CumuCompactionT) here:
    #0 0x55866325953d in operator new(unsigned long) (/mnt/hdd01/ci/master-deploy/cluster0/be/lib/doris_be+0x337a153d) (BuildId: fa8094411569cb8d)
    #1 0x558665fb7d7b in doris::segment_v2::DorisFSDirectoryFactory::getDirectory(std::shared_ptr const&, char const*, bool, lucene::store::LockFactory*) /home/zcp/repo_center/doris_master/doris/be/src/olap/rowset/segment_v2/inverted_index_fs_directory.cpp:797:15
    #2 0x558665f86b5c in doris::segment_v2::InvertedIndexFileWriter::open(doris::TabletIndex const*) /home/zcp/repo_center/doris_master/doris/be/src/olap/rowset/segment_v2/inverted_index_file_writer.cpp:58:17
    #3 0x5586663d0423 in doris::segment_v2::InvertedIndexColumnWriterImpl<(doris::FieldType)5>::open_index_directory() /home/zcp/repo_center/doris_master/doris/be/src/olap/rowset/segment_v2/inverted_index_writer.cpp:160:16
    #4 0x5586663d00f7 in doris::segment_v2::InvertedIndexColumnWriterImpl<(doris::FieldType)5>::init_bkd_index() /home/zcp/repo_center/doris_master/doris/be/src/olap/rowset/segment_v2/inverted_index_writer.cpp:146:16
    #5 0x5586663cb2d9 in doris::segment_v2::InvertedIndexColumnWriterImpl<(doris::FieldType)5>::init() /home/zcp/repo_center/doris_master/doris/be/src/olap/rowset/segment_v2/inverted_index_writer.cpp:107:24
    #6 0x5586663b1201 in doris::segment_v2::InvertedIndexColumnWriter::create(doris::Field const*, std::unique_ptr>*, doris::segment_v2::InvertedIndexFileWriter*, doris::TabletIndex const*) /home/zcp/repo_center/doris_master/doris/be/src/olap/rowset/segment_v2/inverted_index_writer.cpp:697:27
    #7 0x55866632cf8d in doris::segment_v2::ScalarColumnWriter::init() /home/zcp/repo_center/doris_master/doris/be/src/olap/rowset/segment_v2/column_writer.cpp:483:13
    #8 0x5586662a5e52 in doris::segment_v2::SegmentWriter::_create_column_writer(unsigned int, doris::TabletColumn const&, std::shared_ptr const&) /home/zcp/repo_center/doris_master/doris/be/src/olap/rowset/segment_v2/segment_writer.cpp:267:5
    #9 0x5586662a71b4 in doris::segment_v2::SegmentWriter::_create_writers(std::shared_ptr const&, std::vector> const&) /home/zcp/repo_center/doris_master/doris/be/src/olap/rowset/segment_v2/segment_writer.cpp:316:9
    #10 0x5586662a2be3 in doris::segment_v2::SegmentWriter::init(std::vector> const&, bool) /home/zcp/repo_center/doris_master/doris/be/src/olap/rowset/segment_v2/segment_writer.cpp:285:5
    #11 0x5586665333b2 in doris::VerticalBetaRowsetWriter::add_columns(doris::vectorized::Block const*, std::vector> const&, bool, unsigned int) /home/zcp/repo_center/doris_master/doris/be/src/olap/rowset/vertical_beta_rowset_writer.cpp:96:17
    #12 0x558665a8cb9b in doris::Merger::vertical_compact_one_group(std::shared_ptr, doris::ReaderType, doris::TabletSchema const&, bool, std::vector> const&, doris::vectorized::RowSourcesBuffer*, std::vector, std::allocator>> const&, doris::RowsetWriter*, long, doris::Merger::Statistics*, std::vector>, long, doris::CompactionSampleInfo*) /home/zcp/repo_center/doris_master/doris/be/src/olap/merger.cpp:309:9
    #13 0x558665a93190 in doris::Merger::vertical_merge_rowsets(std::shared_ptr, doris::ReaderType, doris::TabletSchema const&, std::vector, std::allocator>> const&, doris::RowsetWriter*, long, long, doris::Merger::Statistics*) /home/zcp/repo_center/doris_master/doris/be/src/olap/merger.cpp:471:21
    #14 0x558665a00fd7 in doris::Compaction::merge_input_rowsets() /home/zcp/repo_center/doris_master/doris/be/src/olap/compaction.cpp:192:19
    #15 0x558665a329a3 in doris::CloudCompactionMixin::execute_compact_impl(long) /home/zcp/repo_center/doris_master/doris/be/src/olap/compaction.cpp:1158:5
    #16 0x558665a33381 in doris::CloudCompactionMixin::execute_compact() /home/zcp/repo_center/doris_master/doris/be/src/olap/compaction.cpp:1173:5
    #17 0x55869a893d89 in doris::CloudCumulativeCompaction::execute_compact() /home/zcp/repo_center/doris_master/doris/be/src/cloud/cloud_cumulative_compaction.cpp:191:38
    #18 0x55869a861fac in doris::CloudStorageEngine::_submit_cumulative_compaction_task(std::shared_ptr const&)::$_1::operator()() const /home/zcp/repo_center/doris_master/doris/be/src/cloud/cloud_storage_engine.cpp:693:31
    #19 0x55869a861fac in void std::__invoke_impl const&)::$_1&>(std::__invoke_other, doris::CloudStorageEngine::_submit_cumulative_compaction_task(std::shared_ptr const&)::$_1&) /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/invoke.h:61:14
    #20 0x55869a861fac in std::enable_if const&)::$_1&>, void>::type std::__invoke_r const&)::$_1&>(doris::CloudStorageEngine::_submit_cumulative_compaction_task(std::shared_ptr const&)::$_1&) /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/invoke.h:111:2
    #21 0x55869a861fac in std::_Function_handler const&)::$_1>::_M_invoke(std::_Any_data const&) /var/local/ldb-toolchain/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/std_function.h:291:9
    #22 0x5586679033fb in doris::ThreadPool::dispatch_thread() /home/zcp/repo_center/doris_master/doris/be/src/util/threadpool.cpp:543:24
    #23 0x5586678db057 in doris::Thread::supervise_thread(void*) /home/zcp/repo_center/doris_master/doris/be/src/util/thread.cpp:498:5
    #24 0x7f0639d7bac2 in start_thread nptl/pthread_create.c:442:8
```
…42183)

1. Change the api dependency to a counted finish dependency to reduce
unnecessary casts.
2. Change `CountedFinishDependency` from a struct to a class.
…hdown and implicit conversion pushdown (apache#42102)

Add a variable `enable_jdbc_cast_predicate_push_down`; its default value
is false, which prohibits the pushdown of non-constant predicates with
type conversion and of all predicates with implicit conversion. This change
prevents wrong predicates from being pushed down to the JDBC data
source and producing incorrect query results, because predicates with casts
were not correctly pushed down to the data source before.
If you find that the data was read correctly and performance was
better before this change, you can manually set this variable to true.

```
| Expression                                          | Can Push Down |
|-----------------------------------------------------|---------------|
| column type equals const type                       | Yes           |
| column type equals cast const type                  | Yes           |
| cast column type equals const type                  | No            |
| cast column type equals cast const type             | No            |
| column type not equals column type                  | No            |
| column type not equals cast const type              | No            |
| cast column type not equals const type              | No            |
| cast column type not equals cast const type         | No            |

```
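A hedged usage sketch: the catalog, database, and table names below are hypothetical; only the session variable comes from this change, and re-enabling it is only advisable when the data source is known to evaluate such predicates correctly.

```
-- After this change (default false), a cast predicate like this is evaluated in Doris
-- instead of being pushed down to the JDBC source:
SELECT * FROM jdbc_mysql.db1.orders WHERE CAST(order_id AS VARCHAR) = '42';

-- Opt back in per session if the source handles these predicates correctly
-- and pushdown is measurably faster:
SET enable_jdbc_cast_predicate_push_down = true;
```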
…ache#42039)

## Proposed changes
Currently, an upstream BE (sink_v2) opens multiple streams to a
downstream BE (load_stream).
If any of the streams fails, the use_cnt on the downstream BE gets
messed up.
The load_stream then does not report any successful tablets to the sink_v2,
since from its point of view there are still unfinished streams.

So fault tolerance when opening streams is not meaningful in practice, and
may cause data loss:
the upstream thinks there are still working streams to transfer data, but the
downstream never reports any commit info.

This PR removes fault tolerance when opening multiple streams to the same
backend.
If any of the opens fails, the upstream sink_v2 should mark the
replicas on the downstream BE as failed.
BiteTheDDDDt and others added 24 commits November 1, 2024 18:04
…hodKeysFixed (apache#43005)

## Proposed changes
remove has_nullable_keys template argument of MethodKeysFixed
…he#43051)

### What problem does this PR solve?

Related PR: apache#42950

Problem Summary:

PR apache#42950 changed some logic in ExportJob by removing
`taskIdToExecutor`, which was
a thread-safe ConcurrentHashMap.
But there is a problem: when cancelling an export job, it clears
the `jobExecutorList` in ExportJob,
and meanwhile this `jobExecutorList` may be traversed concurrently while
the export job is being created,
causing a ConcurrentModificationException.

This PR fixes it by holding the writeLock of ExportMgr when cancelling the
export job.
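A minimal Java sketch of the locking idea described above (class and method names are illustrative, not the actual Doris code): take the ExportMgr write lock for the whole cancellation, so the job's executor list cannot be cleared while another thread is still traversing it during registration.

```
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Illustrative sketch only; the real ExportMgr / ExportJob classes differ.
class ExportMgrSketch {
    private final ReentrantReadWriteLock rwLock = new ReentrantReadWriteLock(true);

    // Registration runs under the write lock and traverses the job's executor list.
    void addExportJobAndRegisterTask(ExportJobSketch job) {
        rwLock.writeLock().lock();
        try {
            job.traverseExecutors();
        } finally {
            rwLock.writeLock().unlock();
        }
    }

    // The fix: also take the write lock before cancelling, so the executor list
    // cannot be cleared while registration is still iterating over it.
    void cancelExportJob(ExportJobSketch job) {
        rwLock.writeLock().lock();
        try {
            job.clearExecutors();
        } finally {
            rwLock.writeLock().unlock();
        }
    }
}

class ExportJobSketch {
    void traverseExecutors() { /* iterate jobExecutorList */ }
    void clearExecutors()    { /* clear jobExecutorList */ }
}
```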
…map catch exception (apache#41759)

Currently, the MoW table lock is released on MS when committing the txn; however, if
calculating the delete bitmap fails before the txn is committed, this lock is not
released, which causes another load task to hang on acquiring the MoW lock
until the lock from the last txn expires.
Boost characterLiteralTypeCoercion by checking the string format first and skipping
the thrown Throwable when a string cannot be parsed to an
integer/float/date/datetime.

This logic usually appears when searching for an `If` function signature: because
`If` has many signatures, we need to cast the arguments to the signature
types to match the best signature, for example:

```
select if(column_1, 'xxx', 'yyy')
```
we check whether 'xxx' and 'yyy' can be parsed to int/datetime
and so on.
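A hedged Java sketch of the optimization idea (names are illustrative, not the actual Nereids code): cheaply check the string's shape before attempting a parse, so signature matching does not pay for a thrown-and-caught exception on every non-numeric literal.

```
import java.util.Optional;
import java.util.regex.Pattern;

// Illustrative sketch; the real characterLiteralTypeCoercion logic differs.
final class LiteralCoercionSketch {
    // Cheap shape check: optional sign followed by digits only.
    private static final Pattern INTEGER_SHAPE = Pattern.compile("[+-]?\\d+");

    static Optional<Long> tryParseLong(String s) {
        // Skip the expensive throw/catch path for strings like "xxx" that
        // obviously cannot be an integer literal.
        if (s == null || !INTEGER_SHAPE.matcher(s).matches()) {
            return Optional.empty();
        }
        try {
            return Optional.of(Long.parseLong(s));
        } catch (NumberFormatException overflow) {
            // Shape matched but the value does not fit into a long.
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        System.out.println(tryParseLong("123"));   // Optional[123]
        System.out.println(tryParseLong("xxx"));   // Optional.empty, no exception thrown
    }
}
```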



In some scenarios, this optimization can provide a 16% QPS improvement.

before optimization:
<img width="1901" alt="image"
src="https://github.com/user-attachments/assets/b03d2d29-5d3b-45a6-ba54-2bcc7c2dccca">

<img width="1484" alt="image"
src="https://github.com/user-attachments/assets/82cbb2b0-dfe8-4a05-bc2f-ebb35dc23209">

after optimize:
<img width="1724" alt="image"
src="https://github.com/user-attachments/assets/d60a867d-596d-4ac1-9377-6460ed6d3dd1">

<img width="1722" alt="image"
src="https://github.com/user-attachments/assets/c9c9f72c-3a5f-4c24-95d9-9ca99ecab0a6">
Our workmates run doris compose on the same dev machine. Sometimes a
dockerized doris_be consumes high CPU, and we want to find out whom it belongs to.
But if we list processes with the command 'ps -elf' on the host machine, we only
see that its owner is root.
So we let the doris compose docker containers run as the host user; then the
command 'ps -elf' can tell who a process belongs to.

To achieve this goal, we pass the host user name and uid into the docker container,
let the container add a matching user, and then run commands with
this new account.

If users still want to run as root in the container, they need to pass the
'--root' option when creating a new doris compose cluster.

```
# run as root in container
python doris-compose.py  up   --root ...

# run as host user in container
python doris-compose.py up ...
```
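A hedged shell sketch of the mechanism described above (the variable names, image name, and entrypoint are illustrative, not the actual doris-compose implementation): pass the host user name and uid into the container, create a matching account there, and start the process under it.

```
# On the host: forward the caller's identity into the container (illustrative image name).
docker run -e HOST_USER="$(id -un)" -e HOST_UID="$(id -u)" my-doris-be-image

# Inside the container entrypoint: create a matching account and drop privileges.
useradd --uid "${HOST_UID}" --no-create-home "${HOST_USER}" 2>/dev/null || true
exec su "${HOST_USER}" -c "/opt/doris/be/bin/start_be.sh"
```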
…rowset schema in backend side (apache#41482)

1. In local mode we limit the column size in the commit_txn phase; in cloud mode
we limit it in `write_schema_dict`, which is in the commit_rowset phase.
2. Add a config to decide whether to fill the schema on the BE side or the MS side.
## Proposed changes
Follow-up to apache#42200.

Null values in text tables are now correctly recognized.
…Resource Release (apache#43059)

This pull request optimizes the JDBC connection and resource management
in `JdbcClient` and related methods. The main changes are as follows:

- **Manual Resource Closing**: Explicitly place resource acquisition and
closing within a `try-catch-finally` structure. In the `finally` block,
manually close resources like `Connection`, `Statement`, and `ResultSet`
to ensure that resources are properly released even if an exception
occurs.

- **Remove try-with-resources**: All usages of `try-with-resources` have
been removed, switching to manual resource management instead.

- **Improve Exception Handling**: When closing resources, if an
exception occurs, log a warning but do not throw a new exception. This
avoids masking the original exception during the resource release phase.

- **Unified Resource Management**: Ensure consistent resource
acquisition and closing methods across all functions, improving code
readability and maintainability.

Functions modified include but are not limited to:

- `executeStmt`
- `getColumnsFromQuery`
- `getDatabaseNameList`
- `getJdbcColumnsInfo`
- `processTable`
- `getColumnsDataTypeUseQuery`

These changes enhance the robustness of the code, prevent potential
resource leaks, and ensure that all JDBC connections and resources are
properly closed without using `try-with-resources`.
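A minimal Java sketch of the pattern described above (the signature and body are illustrative; only the method name comes from the list above): acquire Connection/Statement/ResultSet explicitly and close them in a finally block where failures are only logged, so they cannot mask the original exception.

```
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Logger;
import javax.sql.DataSource;

// Illustrative sketch only; JdbcClient in Doris handles pooling and logging differently.
final class ManualCloseSketch {
    private static final Logger LOG = Logger.getLogger(ManualCloseSketch.class.getName());

    static List<String> getDatabaseNameList(DataSource ds, String sql) throws SQLException {
        Connection conn = null;
        Statement stmt = null;
        ResultSet rs = null;
        try {
            conn = ds.getConnection();
            stmt = conn.createStatement();
            rs = stmt.executeQuery(sql);
            List<String> names = new ArrayList<>();
            while (rs.next()) {
                names.add(rs.getString(1));
            }
            return names;
        } finally {
            // Close in reverse order; log and continue so a close failure
            // never hides the exception (if any) thrown from the try block.
            close(rs);
            close(stmt);
            close(conn);
        }
    }

    private static void close(AutoCloseable resource) {
        if (resource == null) {
            return;
        }
        try {
            resource.close();
        } catch (Exception e) {
            LOG.warning("failed to close resource: " + e.getMessage());
        }
    }
}
```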
1. FE uses 5005 as the JVM debug port; users can run `docker ps | grep
<fe-node-name>` to find the corresponding port on the host machine.
2. Extend the get-cloud-ini command to print more info.
apache#42865)

…tinguishing clusters

## Proposed changes

Issue Number: close #xxx

<!--Describe your changes.-->

Replace the function getAllBackendIds with getBackendsByCurrentCluster.
…nt (apache#43016)

Add cases to ensure the correctness of the document.
### What problem does this PR solve?

Problem Summary:
```
2024-11-01 19:42:52,521 WARN (mysql-nio-pool-117|9514) [StmtExecutor.execute():616] Analyze failed. stmt[250257, 59c581a512e7468f-b1cfd7d4b63fed33]
org.apache.doris.common.NereidsException: errCode = 2, detailMessage = java.util.ConcurrentModificationException
        at org.apache.doris.qe.StmtExecutor.executeByNereids(StmtExecutor.java:780) ~[doris-fe.jar:1.2-SNAPSHOT]
        at org.apache.doris.qe.StmtExecutor.execute(StmtExecutor.java:601) ~[doris-fe.jar:1.2-SNAPSHOT]
        at org.apache.doris.qe.StmtExecutor.queryRetry(StmtExecutor.java:564) ~[doris-fe.jar:1.2-SNAPSHOT]
        at org.apache.doris.qe.StmtExecutor.execute(StmtExecutor.java:554) ~[doris-fe.jar:1.2-SNAPSHOT]
        at org.apache.doris.qe.ConnectProcessor.executeQuery(ConnectProcessor.java:340) ~[doris-fe.jar:1.2-SNAPSHOT]
        at org.apache.doris.qe.ConnectProcessor.handleQuery(ConnectProcessor.java:243) ~[doris-fe.jar:1.2-SNAPSHOT]
        at org.apache.doris.qe.MysqlConnectProcessor.handleQuery(MysqlConnectProcessor.java:208) ~[doris-fe.jar:1.2-SNAPSHOT]
        at org.apache.doris.qe.MysqlConnectProcessor.dispatch(MysqlConnectProcessor.java:236) ~[doris-fe.jar:1.2-SNAPSHOT]
        at org.apache.doris.qe.MysqlConnectProcessor.processOnce(MysqlConnectProcessor.java:413) ~[doris-fe.jar:1.2-SNAPSHOT]
        at org.apache.doris.mysql.ReadListener.lambda$handleEvent$0(ReadListener.java:52) ~[doris-fe.jar:1.2-SNAPSHOT]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) ~[?:?]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) ~[?:?]
        at java.lang.Thread.run(Thread.java:840) ~[?:?]
Caused by: org.apache.doris.common.AnalysisException: errCode = 2, detailMessage = java.util.ConcurrentModificationException
        ... 13 more
Caused by: java.util.ConcurrentModificationException
        at java.util.ArrayList.forEach(ArrayList.java:1513) ~[?:?]
        at org.apache.doris.load.ExportMgr.addExportJobAndRegisterTask(ExportMgr.java:120) ~[doris-fe.jar:1.2-SNAPSHOT]
        at org.apache.doris.nereids.trees.plans.commands.ExportCommand.run(ExportCommand.java:149) ~[doris-fe.jar:1.2-SNAPSHOT]
        at org.apache.doris.qe.StmtExecutor.executeByNereids(StmtExecutor.java:749) ~[doris-fe.jar:1.2-SNAPSHOT]
        ... 12 more
```

### Check List (For Committer)

- Test <!-- At least one of them must be included. -->

    - [ ] Regression test
    - [ ] Unit Test
    - [ ] Manual test (add detailed scripts or steps below)
    - [x] No need to test or manual test. Explain why:
- [ ] This is a refactor/code format and no logic has been changed.
        - [x] Previous test can cover this change.
        - [ ] No code files have been changed.
        - [ ] Other reason <!-- Add your reason?  -->

- Behavior changed:

    - [x] No.
    - [ ] Yes. <!-- Explain the behavior change -->

- Does this need documentation?

    - [ ] No.
- [ ] Yes. <!-- Add document PR link here. eg:
apache/doris-website#1214 -->

- Release note

    <!-- bugfix, feat, behavior changed need a release note -->
    <!-- Add one line release note for this PR. -->
    None

### Check List (For Reviewer who merge this PR)

- [ ] Confirm the release note
- [ ] Confirm test cases
- [ ] Confirm document
- [ ] Add branch pick label <!-- Add branch pick label that this PR
should merge into -->
…3079)

The syntax is:

```
KW_DROP opt_tmp:isTempPartition KW_PARTITION ...
```
### What problem does this PR solve?
Problem Summary:
There are 2 locks: the lock of ExportMgr and the lock of ExportJob (a synchronized
lock).
Previously, the lock order was wrong:
1. When cancelling a job, it first acquired the job lock, then the mgr
lock.
2. When removing an old job, it first acquired the mgr lock, then the job
lock.

This PR fixes it by always acquiring the job lock after the mgr lock, to avoid deadlock.
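A minimal Java sketch of the deadlock and the fix described above (class names are illustrative): if one thread takes the job lock and then the manager lock while another takes them in the opposite order, each can block on the lock the other holds; always acquiring the manager lock before the job lock removes the cycle.

```
// Illustrative sketch only; the real ExportMgr uses a ReentrantReadWriteLock
// and ExportJob uses Java object monitors (synchronized).
final class LockOrderSketch {
    private final Object mgrLock = new Object();

    void removeOldJob(ExportJobStub job) {
        synchronized (mgrLock) {          // 1. manager lock first
            synchronized (job) {          // 2. job lock second
                job.markRemoved();
            }
        }
    }

    // Before the fix, cancel took the job lock first and then the manager lock,
    // which could deadlock against removeOldJob(). The fix uses the same order:
    void cancelJob(ExportJobStub job) {
        synchronized (mgrLock) {          // 1. manager lock first
            synchronized (job) {          // 2. job lock second
                job.cancel();
            }
        }
    }

    static final class ExportJobStub {
        void markRemoved() { /* remove job state */ }
        void cancel()      { /* cancel job */ }
    }
}
```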
…ss to OSS-HDFS (apache#42585)

Fix the problem that the paimon catalog cannot access OSS-HDFS.

There are 2 problems in the paimon catalog:
1. Doris FE cannot list paimon tables.
This is because we pass these three properties -- `fs.oss.endpoint /
fs.oss.accessKeyId / fs.oss.accessKeySecret` -- to the PaimonCatalog.
When PaimonCatalog gets these three properties, it uses `OSSLoader`
rather than `HadoopFileIOLoader`.

2. Doris BE does not use libhdfs to access OSS-HDFS.
This is because the `tmpLocation` in `LocationPath` does not contain
`oss-dls.aliyuncs`. We should use the `endpoint` to judge whether the user wants to
access OSS-HDFS.

What's more, if you want to access OSS-HDFS with PaimonCatalog, you
should:
1. Download the Jindo SDK:
https://github.com/aliyun/alibabacloud-jindodata/blob/latest/docs/user/zh/jindosdk/jindosdk_download.md
2. Copy `jindo-core.jar` and `jindo-sdk.jar` to the `${DORIS_HOME}/fe/lib` and
`${DORIS_HOME}/be/lib/java_extensions/preload-extensions` directories.
…he#43049)

### What problem does this PR solve?

1. Different remoteFs should correspond to different nativeFs.
2. If it is s3,  we do not need to delete the stage directory.
3. When an error occurs while deleting a directory, we need to roll back.
…error (apache#42262)

### What problem does this PR solve?
Problem Summary:

#### Issue 1
There are some fields that are missing in the audit log table.
This PR adds them all:
- shuffle_send_rows
- shuffle_send_bytes
- scan_bytes_from_local_storage
- scan_bytes_from_remote_storage
- is_nereids
- compute_group

Notice that `compute_group` was previously named `cloudClusterName` in
fe.audit.log,
which is incorrect, so I changed it to the right name.

After this PR, all these fields will be saved in both the audit log table
and fe.audit.log.

#### Issue 2
The `AuditEventBuilder` needs to be reset on each run; otherwise there will be
duplicate audit logs.

#### Issue 3
Add a new statement `call flush_audit_log()`. It flushes the audit
log to the audit_log table immediately.
This is useful in test cases, so that we don't need to wait 1 minute for
the audit log data to be flushed.
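A hedged usage sketch: the `call flush_audit_log()` statement comes from this PR, while the audit-table name in the follow-up query is an assumption and may differ by deployment.

```
-- Force the buffered audit events to be written out immediately
-- (normally they are flushed on a timer).
CALL flush_audit_log();

-- The fresh entries can then be inspected right away.
-- NOTE: the table name below is an assumption; adjust it to your audit-log table.
SELECT query_id, shuffle_send_rows, compute_group
FROM __internal_schema.audit_log
ORDER BY `time` DESC LIMIT 10;
```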

### Release note

[fix](auditlog) add missing audit log fields and duplicate audit log error
- removing unnecessary comments and using a temporary directory for the
Git clone operation.
## Proposed changes

Issue Number: close #xxx

apache#42072 missed a case
…ache#42674)

Support getting the row count for PostgreSQL and SQL Server. The row count value
is obtained through the databases' statistics info.
…e#38946)

## Proposed changes

1. add session variable: `use_consistent_hash_for_external_scan`, which
can specify consistent hash for external scan.
2. add session variable: `ignore_split_type`, which can ignore splits of
the specified type; used for performance tuning (see the usage sketch below).
3. add split weight for paimon split with consistent hash.
4. add `executeFilter` for paimon jni split.
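A hedged usage sketch for the new session variables named above; the boolean value is an assumption based on the description, and the accepted values of `ignore_split_type` are not listed here, so none are shown.

```
-- Use consistent hashing when assigning external (e.g. Paimon) scan splits to BEs.
SET use_consistent_hash_for_external_scan = true;

-- Inspect the current setting (the variable name comes from this PR).
SHOW VARIABLES LIKE 'use_consistent_hash_for_external_scan';
```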

github-actions bot commented Nov 4, 2024

sh-checker report

To get the full details, please check in the job output.

shellcheck errors

'shellcheck ' returned error 1 finding the following syntactical issues:

----------

In samples/datalake/deltalake_and_kudu/start-trinoconnector-compose.sh line 162:
export KUDU_QUICKSTART_IP=$(ifconfig | grep "inet " | grep -Fv 127.0.0.1 |  awk '{print $2}' | tail -1)
       ^----------------^ SC2155 (warning): Declare and assign separately to avoid masking return values.


In samples/datalake/deltalake_and_kudu/stop-trinoconnector-compose.sh line 20:
export KUDU_QUICKSTART_IP=$(ifconfig | grep "inet " | grep -Fv 127.0.0.1 |  awk '{print $2}' | tail -1)
       ^----------------^ SC2155 (warning): Declare and assign separately to avoid masking return values.

For more information:
  https://www.shellcheck.net/wiki/SC2155 -- Declare and assign separately to ...
----------

You can address the above issues in one of three ways:
1. Manually correct the issue in the offending shell script;
2. Disable specific issues by adding the comment:
  # shellcheck disable=NNNN
above the line that contains the issue, where NNNN is the error code;
3. Add '-e NNNN' to the SHELLCHECK_OPTS setting in your .yml action file.



shfmt errors

'shfmt ' returned error 1 finding the following formatting issues:

----------
--- samples/datalake/deltalake_and_kudu/scripts/start_doris.sh.orig
+++ samples/datalake/deltalake_and_kudu/scripts/start_doris.sh
@@ -20,8 +20,8 @@
 export JAVA_HOME=/opt/jdk-17.0.2
 
 cp -r /opt/doris-bin /opt/doris
-echo "trino_connector_plugin_dir=/opt/connectors/" >> /opt/doris/fe/conf/fe.conf
-echo "trino_connector_plugin_dir=/opt/connectors/" >> /opt/doris/be/conf/be.conf
+echo "trino_connector_plugin_dir=/opt/connectors/" >>/opt/doris/fe/conf/fe.conf
+echo "trino_connector_plugin_dir=/opt/connectors/" >>/opt/doris/be/conf/be.conf
 
 /opt/doris/fe/bin/start_fe.sh --daemon
 /opt/doris/be/bin/start_be.sh --daemon
--- samples/datalake/deltalake_and_kudu/start-trinoconnector-compose.sh.orig
+++ samples/datalake/deltalake_and_kudu/start-trinoconnector-compose.sh
@@ -33,7 +33,6 @@
 hdfs_plugin="ff4a3e3b32dcce27f4df58f17938abde"
 kudu_java_example="1afe0a890785e8d0011ea7342ae5e43d"
 
-
 download_source_file() {
     local FILE_PATH="$1"
     local EXPECTED_MD5="$2"
@@ -79,9 +78,6 @@
 download_source_file "trino-hdfs-435-20240724.tar.gz" "${hdfs_plugin}" "https://github.com/apache/doris-thirdparty/releases/download/trino-435-20240724"
 download_source_file "kudu-java-example-1.0-SNAPSHOT.jar" "${kudu_java_example}" "https://github.com/apache/doris-thirdparty/releases/download/trino-435-20240724"
 
-
-
-
 if [[ ! -f "jdk-17.0.2/SUCCESS" ]]; then
     echo "Prepare jdk17 environment"
     if [[ -d "jdk-17.0.2" ]]; then
@@ -156,10 +152,9 @@
     touch connectors/trino-delta-lake-435/hdfs/SUCCESS
 fi
 
-
 cd ../
 
-export KUDU_QUICKSTART_IP=$(ifconfig | grep "inet " | grep -Fv 127.0.0.1 |  awk '{print $2}' | tail -1)
+export KUDU_QUICKSTART_IP=$(ifconfig | grep "inet " | grep -Fv 127.0.0.1 | awk '{print $2}' | tail -1)
 
 docker compose -f trinoconnector-compose.yml --env-file trinoconnector-compose.env up -d
 echo "Create hive table ..."
--- samples/datalake/deltalake_and_kudu/stop-trinoconnector-compose.sh.orig
+++ samples/datalake/deltalake_and_kudu/stop-trinoconnector-compose.sh
@@ -17,6 +17,6 @@
 # specific language governing permissions and limitations
 # under the License.
 
-export KUDU_QUICKSTART_IP=$(ifconfig | grep "inet " | grep -Fv 127.0.0.1 |  awk '{print $2}' | tail -1)
+export KUDU_QUICKSTART_IP=$(ifconfig | grep "inet " | grep -Fv 127.0.0.1 | awk '{print $2}' | tail -1)
 
 docker compose -f trinoconnector-compose.yml --env-file trinoconnector-compose.env down
----------

You can reformat the above files to meet shfmt's requirements by typing:

  shfmt  -w filename


@CalvinKirs CalvinKirs force-pushed the master-autorun-workflow branch from 0bab805 to 110f5ee on November 4, 2024 03:07