[spark] Disable compaction for data evolution table #6342

juntaozhang · 2025-09-26T12:18:53Z

Purpose

A Spark data evolution table can return inconsistent query results before and after compaction. Example:

CREATE TABLE s (id INT, b INT);
INSERT INTO s VALUES (1, 11), (2, 22);

CREATE TABLE t (id INT, b INT, c INT) TBLPROPERTIES ('row-tracking.enabled' = 'true', 'data-evolution.enabled' = 'true');
INSERT INTO t VALUES (2, 2, 2), (3, 3, 3);
MERGE INTO t
USING s
ON t.id = s.id
WHEN MATCHED THEN UPDATE SET t.b = s.b
WHEN NOT MATCHED THEN INSERT (id, b, c) VALUES (id, b, 0);
SELECT *, _ROW_ID, _SEQUENCE_NUMBER FROM t ORDER BY _ROW_ID ASC;
CALL sys.compact(table => 't');
SELECT *, _ROW_ID, _SEQUENCE_NUMBER FROM t ORDER BY _ROW_ID ASC;

Before compaction:

+----+----+---+---------+------------------+
| id |  b | c | _ROW_ID | _SEQUENCE_NUMBER |
+----+----+---+---------+------------------+
|  2 | 22 | 2 |       0 |                2 |
|  3 |  3 | 3 |       1 |                2 |
|  1 | 11 | 0 |       2 |                2 |
+----+----+---+---------+------------------+

After compaction (rows with _ROW_ID 0 and 1 each appear twice, and the newer copy has NULL for id and c):

+--------+----+--------+---------+------------------+
|     id |  b |      c | _ROW_ID | _SEQUENCE_NUMBER |
+--------+----+--------+---------+------------------+
| <NULL> | 22 | <NULL> |       0 |                2 |
|      2 |  2 |      2 |       0 |                1 |
| <NULL> |  3 | <NULL> |       1 |                2 |
|      3 |  3 |      3 |       1 |                1 |
|      1 | 11 |      0 |       2 |                2 |
+--------+----+--------+---------+------------------+
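The two outputs above can be related with a small sketch. It assumes, as an illustration only and not as a description of Paimon's actual reader, that rows sharing a _ROW_ID are meant to be folded into one row by taking, for each column, the non-NULL value from the version with the highest _SEQUENCE_NUMBER. The function name `merge_by_row_id` is hypothetical, and the sketch cannot distinguish a column that was not written from a genuine NULL value.

```python
from collections import defaultdict

# Rows copied from the "after compaction" output:
# (id, b, c, _ROW_ID, _SEQUENCE_NUMBER); None stands for <NULL>.
after_compaction = [
    (None, 22, None, 0, 2),
    (2, 2, 2, 0, 1),
    (None, 3, None, 1, 2),
    (3, 3, 3, 1, 1),
    (1, 11, 0, 2, 2),
]


def merge_by_row_id(rows):
    """Fold rows that share a _ROW_ID into one row (assumed semantics)."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[3]].append(row)

    merged = []
    for row_id in sorted(groups):
        # Newest version first, so its non-NULL columns take precedence.
        versions = sorted(groups[row_id], key=lambda r: r[4], reverse=True)
        values = [
            next((v[col] for v in versions if v[col] is not None), None)
            for col in range(3)
        ]
        merged.append((*values, row_id, versions[0][4]))
    return merged


for row in merge_by_row_id(after_compaction):
    print(row)
```

Under that assumption, folding the five post-compaction rows yields the same three rows as the pre-compaction query, which suggests the data itself is intact and only the per-_ROW_ID merge is missing from the read after compaction.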

This PR disables compaction for data evolution tables in Spark, to align with the Flink behavior (#6112).

Tests

API and Format

Documentation

[spark] Disable compaction for data evolution table

64d79a8

juntaozhang force-pushed the fix-disable-compact branch from 99da4e8 to 64d79a8 on September 28, 2025 03:07

[spark] Disable compaction for data evolution table - fix UT

10d8a53
