feat: hilbert clustering #17045

zhyass · 2024-12-12T13:44:18Z

I hereby agree to the terms of the CLA available at: https://docs.databend.com/dev/policies/cla/

Summary

This PR refines Hilbert clustering in Databend by adopting a range-based partitioning approach. It samples cluster keys, assigns range partition IDs, and calculates Hilbert indexes for efficient pruning and clustering. Key changes include:

Removing the old Hilbert clustering logic.
Excluding stable segments from reclustering to preserve optimal clustering results.
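
The sampling and range-partitioning step described in the summary can be illustrated with a minimal Python sketch. The helper names range_bounds and range_partition_id mirror the SQL functions shown in the equivalent query in this PR, but their bodies here are assumptions for illustration only, not Databend's implementation. In particular, choosing boundaries as evenly spaced quantiles of a random sample, and the tie-breaking rule for values equal to a boundary, are guesses.

```python
import bisect
import random


def range_bounds(values, num_partitions, sample_size, seed=0):
    """Pick at most num_partitions - 1 boundaries from a random sample (hypothetical)."""
    rng = random.Random(seed)
    sample = sorted(rng.sample(values, min(sample_size, len(values))))
    if not sample:
        return []
    bounds = []
    for i in range(1, num_partitions):
        # Evenly spaced quantiles of the sorted sample act as boundaries.
        idx = min(i * len(sample) // num_partitions, len(sample) - 1)
        bound = sample[idx]
        # Skip duplicates so the boundaries stay strictly increasing.
        if not bounds or bound > bounds[-1]:
            bounds.append(bound)
    return bounds


def range_partition_id(value, bounds):
    """Return the index of the first boundary >= value, or len(bounds) if none."""
    return bisect.bisect_left(bounds, value)


if __name__ == "__main__":
    column = list(range(1000))
    bounds = range_bounds(column, num_partitions=4, sample_size=1000)
    print(bounds, [range_partition_id(v, bounds) for v in (0, 300, 999)])
```

Because the boundaries come from the data itself, skewed columns still spread across partition IDs roughly evenly, which is the property the summary relies on for pruning.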

mysql> create table t(a int, b int) cluster by hilbert(a,b);
Query OK, 0 rows affected (0.16 sec)

mysql> insert into t values(1,1),(2,2);
+-------------------------+
| number of rows inserted |
+-------------------------+
|                       2 |
+-------------------------+
1 row in set (0.13 sec)
Read 2 rows, 18.00 B in 0.079 sec., 25.31 rows/sec., 227.78 B/sec.

mysql> insert into t values(0,0),(3,3);
+-------------------------+
| number of rows inserted |
+-------------------------+
|                       2 |
+-------------------------+
1 row in set (0.12 sec)
Read 2 rows, 18.00 B in 0.066 sec., 30.31 rows/sec., 272.75 B/sec.

mysql> alter table t recluster;
Query OK, 4 rows affected (1.87 sec)

mysql> select * from hilbert_clustering_information('default','t');
+-------------+---------+----------------------------+---------------------+----------------------+-----------------------+---------------------------+
| cluster_key | type    | timestamp                  | total_segment_count | stable_segment_count | partial_segment_count | unclustered_segment_count |
+-------------+---------+----------------------------+---------------------+----------------------+-----------------------+---------------------------+
| (a, b)      | hilbert | 2025-01-07 13:10:30.105419 |                   1 |                    0 |                     1 |                         0 |
+-------------+---------+----------------------------+---------------------+----------------------+-----------------------+---------------------------+
1 row in set (0.07 sec)
Read 1 rows, 53.00 B in 0.047 sec., 21.36 rows/sec., 1.11 KiB/sec.

The statement alter table t recluster is equivalent to:

WITH _keys_bound AS (
  SELECT 
    range_bound(1024, 1000)(a) AS a_bound, 
    range_bound(1024, 1000)(b) AS b_bound 
  FROM 
    default.t
), 
_source_data AS (
  SELECT 
    t.*, 
    hilbert_index(
      [hilbert_key(cast(ifnull(range_partition_id(t.a, _keys_bound.a_bound), 1023) as uint16)), hilbert_key(cast(ifnull(range_partition_id(t.b, _keys_bound.b_bound), 1023) as uint16))], 
      2
    ) AS _hilbert_index 
  FROM 
    default.t, 
    _keys_bound
) 
SELECT 
  * EXCLUDE(_hilbert_index) 
FROM 
  _source_data 
ORDER BY 
  _hilbert_index
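
To make the ordering step concrete, here is a minimal Python sketch of a two-dimensional Hilbert index followed by a sort, loosely mirroring the ORDER BY _hilbert_index in the query above. It uses the standard iterative coordinate-to-distance algorithm for a Hilbert curve; whether Databend's hilbert_index and hilbert_key use the same bit layout is not stated in this PR, so treat the function names and the exact index values as assumptions. The coordinates stand in for range partition IDs (an order of 10 would correspond to 1024 partitions per dimension).

```python
def hilbert_index_2d(x, y, order):
    """Map (x, y), each in [0, 2**order), to its distance along a Hilbert curve."""
    n = 1 << order
    d = 0
    s = n >> 1
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the sub-curve has the right orientation.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s >>= 1
    return d


def recluster_order(rows, order):
    """Sort (a_id, b_id) pairs by Hilbert index (hypothetical helper)."""
    return sorted(rows, key=lambda r: hilbert_index_2d(r[0], r[1], order))


if __name__ == "__main__":
    # The four rows inserted into table t, used directly as partition IDs.
    rows = [(1, 1), (2, 2), (0, 0), (3, 3)]
    for row in recluster_order(rows, order=2):
        print(row, hilbert_index_2d(row[0], row[1], 2))
```

Rows that are close on the curve are close in both dimensions, which is why range pruning can remain effective for conditions on either cluster key column.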

Test

Prepare data.

create table test_source (
    id bigint not null,
    id1 bigint,
    id2 bigint,
    id3 bigint,
    id4 bigint,
    id5 bigint,
    id6 bigint,
    id7 bigint,
    
    s1 varchar,
    s2 varchar,
    s3 varchar,
    s4 varchar,
    s5 varchar,
    s6 varchar,
    s7 varchar,
    s8 varchar,
    s9 varchar,
    s10 varchar,
    s11 varchar,
    s12 varchar,
    s13 varchar,
    
    d1 DECIMAL(20, 8),
    d2 DECIMAL(20, 8),
    d3 DECIMAL(20, 8),
    d4 DECIMAL(20, 8),
    d5 DECIMAL(20, 8),
    d6 DECIMAL(30, 8),
    d7 DECIMAL(30, 8),
    d8 DECIMAL(30, 8),
    d9 DECIMAL(30, 8),
    d10 DECIMAL(30, 8),
    
    insert_time datetime  not null,
    insert_time1 datetime,
    insert_time2 datetime,
    insert_time3 datetime,
    
    i int
);

create table test_random like test_source Engine = Random;
create table test_hilbert like test_source cluster by hilbert(id, insert_time);
create table test_linear like test_source cluster by(id, insert_time);
-- created in the old warehouse
create table test_hilbert_old like test_source cluster by hilbert(id, insert_time);

test_hilbert uses the new Hilbert cluster type.
test_hilbert_old uses the old Hilbert cluster type.
test_linear uses the normal linear cluster type.
Each table contains 600000000 rows.

zzq> select count() from test_linear;
┌───────────┐
│  count()  │
│   UInt64  │
├───────────┤
│ 600000000 │
└───────────┘
1 row read in 0.011 sec. Processed 1 row, 1 B (90.91 rows/s, 90 B/s)

Explain: (pruning stats, query duration, size)
1. id = 1004050502160506553
test_linear:
segments: <range pruning: 2 to 2>, blocks: <range pruning: 2211 to 12, bloom pruning: 12 to 1>
198ms 5.61MB
test_hilbert_old:
segments: <range pruning: 2 to 2>, blocks: <range pruning: 2246 to 45, bloom pruning: 45 to 2>
199ms 7.39MB
test_hilbert:
segments: <range pruning: 2 to 2>, blocks: <range pruning: 2388 to 79, bloom pruning: 79 to 1>
163ms 3.77MB

2. insert_time = '2024-11-27 18:48:25.619751'
test_linear:
segments: <range pruning: 2 to 2>, blocks: <range pruning: 2211 to 2211, bloom pruning: 2211 to 12>
431ms 49.02MB
test_hilbert_old:
segments: <range pruning: 2 to 2>, blocks: <range pruning: 2246 to 884, bloom pruning: 884 to 7>
331ms 29.31MB
test_hilbert:
segments: <range pruning: 2 to 2>, blocks: <range pruning: 2388 to 77, bloom pruning: 77 to 1>
238ms 3.73MB

3. id = 7190230217165929558 and insert_time = '2024-11-27 18:48:25.619751'
test_linear:
segments: <range pruning: 2 to 2>, blocks: <range pruning: 2211 to 14, bloom pruning: 14 to 1>
165ms 4.9MB
test_hilbert_old:
segments: <range pruning: 2 to 2>, blocks: <range pruning: 2246 to 32, bloom pruning: 32 to 1>
157ms 4.45MB
test_hilbert:
segments: <range pruning: 2 to 2>, blocks: <range pruning: 2388 to 2, bloom pruning: 2 to 1>
149ms 3.73MB

4. id = 7190230217165929558 and insert_time >= '2022-08-04 05:38:53.865252' and insert_time <= '2025-01-01 12:12:12.000000'
test_linear:
segments: <range pruning: 2 to 2>, blocks: <range pruning: 2211 to 14, bloom pruning: 14 to 1>
140ms 4.9MB
test_hilbert_old:
segments: <range pruning: 2 to 2>, blocks: <range pruning: 2246 to 32, bloom pruning: 32 to 1>
157ms 4.45MB
test_hilbert:
segments: <range pruning: 2 to 2>, blocks: <range pruning: 2388 to 2, bloom pruning: 2 to 1>
142ms 3.73MB

5. id >= 0 and id <= 100000000000000 and insert_time >= '2022-08-04 05:38:53.865252' and insert_time <= '2025-01-01 12:12:12.000000'
test_linear:
segments: <range pruning: 2 to 2>, blocks: <range pruning: 2211 to 13>
338ms 59.84MB
test_hilbert_old:
segments: <range pruning: 2 to 2>, blocks: <range pruning: 2246 to 18>
348ms 76.53MB
test_hilbert:
segments: <range pruning: 2 to 2>, blocks: <range pruning: 2388 to 2>
189ms 7.49MB

6. insert_time >= '2022-08-04 05:38:53.865252' and insert_time <= '2025-01-01 12:12:12.000000'
test_linear:
segments: <range pruning: 2 to 2>, blocks: <range pruning: 2211 to 2211>
2.2s 566.16MB
test_hilbert_old:
segments: <range pruning: 2 to 2>, blocks: <range pruning: 2246 to 884>
1.1s 209.28MB
test_hilbert:
segments: <range pruning: 2 to 2>, blocks: <range pruning: 2388 to 77>
451ms 22.8MB

7. id >= 0 and id <= 100000000000000
test_linear:
segments: <range pruning: 2 to 2>, blocks: <range pruning: 2211 to 13>
337ms 59.84MB
test_hilbert_old:
segments: <range pruning: 2 to 2>, blocks: <range pruning: 2246 to 41>
715ms 172.72MB
test_hilbert:
segments: <range pruning: 2 to 2>, blocks: <range pruning: 2388 to 73>
1.1s 279.91MB
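
The block counts above are easier to compare as fractions. The following Python sketch takes the range-pruning numbers reported for the seven queries (blocks before and after range pruning, copied from the results above) and computes the share of blocks that survive range pruning per table. The dictionary layout and helper names are hypothetical; only the numbers come from this PR.

```python
TABLES = ("test_linear", "test_hilbert_old", "test_hilbert")

# query number -> (blocks before, blocks after range pruning), one pair per table
RANGE_PRUNING = {
    1: ((2211, 12), (2246, 45), (2388, 79)),
    2: ((2211, 2211), (2246, 884), (2388, 77)),
    3: ((2211, 14), (2246, 32), (2388, 2)),
    4: ((2211, 14), (2246, 32), (2388, 2)),
    5: ((2211, 13), (2246, 18), (2388, 2)),
    6: ((2211, 2211), (2246, 884), (2388, 77)),
    7: ((2211, 13), (2246, 41), (2388, 73)),
}


def remaining_fraction(before, after):
    """Share of blocks still to be read after range pruning."""
    return after / before


def best_table(query):
    """Table with the smallest remaining fraction for the given query."""
    fractions = {
        table: remaining_fraction(*pair)
        for table, pair in zip(TABLES, RANGE_PRUNING[query])
    }
    return min(fractions, key=fractions.get)


if __name__ == "__main__":
    for query, pairs in RANGE_PRUNING.items():
        shares = ", ".join(
            f"{table}={remaining_fraction(*pair):.2%}"
            for table, pair in zip(TABLES, pairs)
        )
        print(f"query {query}: {shares} -> best: {best_table(query)}")
```

This only measures range pruning on blocks; it ignores bloom pruning, query duration, and data size, which are reported separately above.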

Conclusions
The optimized Hilbert clustering outperforms both linear clustering and the previous version of Hilbert clustering in most of the scenarios above, especially when the query conditions involve cluster key columns other than the first one (queries 2 and 6). However, it is weaker when the query condition is a range over only the first cluster key column (query 7: 1.1s and 279.91MB, versus 337ms and 59.84MB for test_linear).

This new version effectively resolves the performance degradation caused by uneven data distribution in the older Hilbert Clustering. A critical point to note is that the optimized Hilbert Clustering performs reclustering on a per-segment basis, potentially involving over 100GB of data, which significantly increases the likelihood of triggering sort spill. As a result, single recluster operations may take longer compared to the previous version, making it unsuitable for automatic reclustering immediately after data ingestion. However, the optimized version has a clear advantage in reducing execution time during the final reclustering phase compared to its predecessor.

It is important to note that although Linear Clustering also uses a localized recluster strategy, it processes max_threads * 4 segments at a time, whereas Hilbert Clustering handles only one segment per operation. This limitation in localized reclustering could become more pronounced under extreme conditions. Therefore, Hilbert Clustering is better suited for scenarios where data ingestion follows a certain order, such as chronological order.
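
As a rough illustration of the batching difference described above, the Python sketch below counts how many recluster operations each strategy would need for a given number of segments. The function name, the example max_threads value of 16, and the assumption that every segment needs reclustering are all hypothetical; the sketch counts operations only and says nothing about how long each one takes.

```python
import math


def recluster_operations(segments_to_recluster, segments_per_operation):
    """Number of recluster operations needed to cover all segments."""
    if segments_per_operation <= 0:
        raise ValueError("segments_per_operation must be positive")
    return math.ceil(segments_to_recluster / segments_per_operation)


if __name__ == "__main__":
    max_threads = 16  # hypothetical setting
    segments = 1000   # hypothetical backlog
    # Linear clustering: max_threads * 4 segments per operation.
    print("linear:", recluster_operations(segments, max_threads * 4))
    # Hilbert clustering: one segment per operation.
    print("hilbert:", recluster_operations(segments, 1))
```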

Tests

Unit Test
Logic Test
Benchmark Test
No Test - Explain why

Type of change

Bug Fix (non-breaking change which fixes an issue)
New Feature (non-breaking change which adds functionality)
Breaking Change (fix or feature that could cause existing functionality not to work as expected)
Documentation Update
Refactoring
Performance Improvement
Other (please describe):


zhang2014 · 2024-12-31T12:49:23Z

Maybe we should add a performance test with Hilbert clustering?

github-actions · 2025-01-07T04:38:56Z

Docker Image for PR

tag: pr-17045-3689543-1736224672

note: this image tag is only available for internal use,
please check the internal doc for more details.

zhyass marked this pull request as draft December 12, 2024 13:44

github-actions bot added the pr-feature this PR introduces a new feature to the codebase label Dec 12, 2024

zhyass force-pushed the feature_cluster_table branch 5 times, most recently from 88111a3 to 059bb42 Compare December 30, 2024 13:51

zhyass marked this pull request as ready for review December 31, 2024 07:03

zhyass requested review from zhang2014, dantengsky and sundy-li December 31, 2024 08:55

zhyass added the ci-cloud Build docker image for cloud test label Dec 31, 2024

zhyass added ci-cloud Build docker image for cloud test and removed ci-cloud Build docker image for cloud test labels Dec 31, 2024

zhyass force-pushed the feature_cluster_table branch 2 times, most recently from e5ea1e6 to 595cff9 Compare January 1, 2025 09:02

zhyass marked this pull request as draft January 1, 2025 17:09

zhyass force-pushed the feature_cluster_table branch from 6ba61fd to c0c2183 Compare January 2, 2025 04:25

zhyass added ci-cloud Build docker image for cloud test and removed ci-cloud Build docker image for cloud test labels Jan 2, 2025

zhyass force-pushed the feature_cluster_table branch 2 times, most recently from b336e03 to ffdc058 Compare January 2, 2025 16:43

zhyass marked this pull request as ready for review January 6, 2025 01:23

zhyass force-pushed the feature_cluster_table branch from 5d22310 to 933d544 Compare January 6, 2025 09:36

zhyass added ci-cloud Build docker image for cloud test and removed ci-cloud Build docker image for cloud test labels Jan 6, 2025

zhyass force-pushed the feature_cluster_table branch from 933d544 to 07a64c0 Compare January 7, 2025 02:38

zhyass added ci-cloud Build docker image for cloud test and removed ci-cloud Build docker image for cloud test labels Jan 7, 2025

databendlabs deleted a comment from github-actions bot Jan 7, 2025

zhyass added 9 commits January 9, 2025 11:50

hilbert_clustering

7c3323e

fix test

ab9b33b

vacuum temp files after recluster

307fbc5

fix

7f7ba67

add test

ced2a5f

fix

96109d5

fix

b7f1214

add hilbert_clustering_information

3a08558

update

3b6e513

zhyass force-pushed the feature_cluster_table branch from a99f13b to 3b6e513 Compare January 9, 2025 03:53

dantengsky reviewed Jan 9, 2025

src/query/functions/src/scalars/hilbert.rs (outdated, resolved)

dantengsky reviewed Jan 9, 2025

src/query/storages/fuse/src/operations/read_partitions.rs (resolved)

dantengsky reviewed Jan 9, 2025

src/query/ee/src/hilbert_clustering/handler.rs (resolved)

zhyass added 2 commits January 10, 2025 12:55

update

a784f86

add comments

ee81272

dantengsky enabled auto-merge January 11, 2025 04:27

dantengsky self-requested a review January 11, 2025 04:27

dantengsky approved these changes Jan 11, 2025

dantengsky added this pull request to the merge queue Jan 11, 2025

BohuTANG removed this pull request from the merge queue due to a manual request Jan 11, 2025

BohuTANG merged commit be6657d into databendlabs:main Jan 11, 2025
70 checks passed
