Sync up behavior on how partition maintenance works between Greenplum 6/7 and Cloudberry #691
Replies: 3 comments 1 reply
-
Hi @lmugnano4537, it's a known behavior. For migration compatibility, seems it's better to keep the old behavior. @gfphoenix78, please investigate more on the behavior difference. |
Beta Was this translation helpful? Give feedback.
-
Hi, @lmugnano4537 , I'll take a look for this issue, do you run cloudberry on the main branch? |
Beta Was this translation helpful? Give feedback.
-
I've opened a PR to track this issue. The child partition table should inherits access method, reloptions from its parents. |
Beta Was this translation helpful? Give feedback.
-
Description
Found a difference in behavior that could have a big impact for customers migrating from Greenplum to Cloudberry. Worse still is that it would be difficult to notice and is a subtle change (ie. it doesn't fail but changes behavior). What is happening is that when I create a partitioned table with storage options like appendoptimized=true, compression, columnar, etc and then later add a new partition to the table.
In Greenplum, the add partition inherits the storage access method info from the parent table
In Cloudberry, the add partition will just create the partition as Heap unless you fully specify the storage and access methods
Example:
Create partitioned table with appendoptimized and compression:
CREATE TABLE star.trd_fct (
trd_date_key date NOT NULL ,
prod_key integer NOT NULL ,
cust_key integer NOT NULL ,
notional_val numeric(20,10) ,
num_of_shares integer ,
num_of_trades integer ,
trd_id bigint NOT NULL ,
ins_ts timestamp DEFAULT CURRENT_TIMESTAMP NOT NULL
)
WITH (
appendoptimized=true
,orientation=row
,compresstype=zlib
,compresslevel=5
)
DISTRIBUTED RANDOMLY
PARTITION BY RANGE (trd_date_key) (
START ('2020-01-01'::date) INCLUSIVE
END ('2027-01-01'::date) EXCLUSIVE
EVERY (interval '1 Year')
)
;
alter table star.trd_fct add partition start ('2027-01-01') inclusive end ('2028-01-01') exclusive;
In Cloudberry, new partition added as:
star_demo=# \d+ star.trd_fct_1_prt_11
Table "star.trd_fct_1_prt_11"
Column | Type | Collation | Nullable | Default | Storage | Compression | Stats target | Description
---------------+-----------------------------+-----------+----------+-------------------+---------+-------------+--------------+-------------
trd_date_key | date | | not null | | plain | | |
prod_key | integer | | not null | | plain | | |
cust_key | integer | | not null | | plain | | |
notional_val | numeric(20,10) | | | | main | | |
num_of_shares | integer | | | | plain | | |
num_of_trades | integer | | | | plain | | |
trd_id | bigint | | not null | | plain | | |
ins_ts | timestamp without time zone | | not null | CURRENT_TIMESTAMP | plain | | |
Partition of: star.trd_fct FOR VALUES FROM ('2027-01-01') TO ('2028-01-01')
Partition constraint: ((trd_date_key IS NOT NULL) AND (trd_date_key >= '2027-01-01'::date) AND (trd_date_key < '2028-01-01'::date))
Distributed randomly
Access method: heap
In Greenplum:
star_demo=# \d+ star.trd_fct_1_prt_11
Table "star.trd_fct_1_prt_11"
Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
---------------+-----------------------------+-----------+----------+-------------------+---------+--------------+-------------
trd_date_key | date | | not null | | plain | |
prod_key | integer | | not null | | plain | |
cust_key | integer | | not null | | plain | |
notional_val | numeric(20,10) | | | | main | |
num_of_shares | integer | | | | plain | |
num_of_trades | integer | | | | plain | |
trd_id | bigint | | not null | | plain | |
ins_ts | timestamp without time zone | | not null | CURRENT_TIMESTAMP | plain | |
Partition of: star.trd_fct FOR VALUES FROM ('2027-01-01') TO ('2028-01-01')
Partition constraint: ((trd_date_key IS NOT NULL) AND (trd_date_key >= '2027-01-01'::date) AND (trd_date_key < '2028-01-01'::date))
Distributed randomly
Access method: ao_row
Options: compresstype=zlib, compresslevel=5, blocksize=32768, checksum=true
Use case/motivation
Since many customers might have existing scripts that don't specify the storage and access method information when adding partitions I would like Cloudberry to work like Greenplum and default to using the parents options (it still should honor the options if the user did set them specifically on the partition but should default to what the table was created with)
Related issues
Didn't see anything related to this
Are you willing to submit a PR?
Beta Was this translation helpful? Give feedback.
All reactions