[Bug] Error detected by sqlancer #594

shmiwy · 2024-08-23T08:59:25Z

Cloudberry Database version

No response

What happened

The last two selects should get the same result, but they are not

What you think should happen instead

No response

How to reproduce

CREATE TABLE IF NOT EXISTS t0(c0 FLOAT DEFAULT (0.64844320698434) UNIQUE PRIMARY KEY NOT NULL, c1 inet DEFAULT ('228.195.152.147') NOT NULL) WITH (autovacuum_vacuum_cost_delay=72);
CREATE TABLE IF NOT EXISTS t1(LIKE t0);
CREATE TABLE t5(LIKE t1);

INSERT INTO t1(c0, c1) VALUES(0.9445783, '152.175.55.223');

INSERT INTO t0(c1, c0) VALUES('86.163.150.122', 0.51151264), ('153.68.173.244', 1.08803891E9), ('201.139.35.173', 0.032230698);

DELETE FROM ONLY t5 RETURNING + (((+ (abs(-1165130706)))+(CAST(((TRUE)AND(FALSE)) AS INT))));
INSERT INTO t5(c1, c0) VALUES('66.75.211.162', 0.4240951), ('10.91.215.127', - (pg_backend_pid())), ('1.78.54.190', 0.7707819) ON CONFLICT DO NOTHING;

INSERT INTO t1(c0, c1) VALUES(0.98276824, '230.228.200.54') ON CONFLICT DO NOTHING;
INSERT INTO t5(c1, c0) VALUES('173.7.80.33', 0.5569468);

INSERT INTO t0(c0) VALUES(2.26024944E8);

INSERT INTO t5(c0, c1) VALUES(0.51061195, '168.211.249.233');

INSERT INTO t5(c0, c1) OVERRIDING USER VALUE VALUES(0.48478556, '175.136.165.46');
INSERT INTO t0(c0) VALUES(0.9292619);

INSERT INTO t0(c0) VALUES(num_nonnulls(((((1349498262)/(839490868)))*(num_nulls(CAST(0.2903752 AS MONEY), CAST(0.83137906 AS MONEY), '46.127.254.133', B'1111111111111111111111111111111111101111010101111010101001111111', 1473917725))))), (-9.3304762E8), (0.042748928), (0.5439545), (0.0026354373);

INSERT INTO t1(c1, c0) VALUES('187.156.97.166', 0.9357929);

INSERT INTO t0(c0) VALUES(0.32275215);
SELECT t1.c0, t1.c1, t5.c0, t5.c1 FROM t1 FULL OUTER JOIN t5 ON (('}n()')LIKE(CAST(((0.7999568)IS DISTINCT FROM(0.53532124)) AS VARCHAR(100)))) WHERE NOT (((t1.c1)>=(t1.c1))) UNION ALL SELECT t1.c0, t1.c1, t5.c0, t5.c1 FROM t1* FULL OUTER JOIN t5* ON (('}n()')LIKE(CAST(((0.7999568)IS DISTINCT FROM(0.53532124)) AS VARCHAR(100)))) WHERE NOT (NOT (((t1.c1)>=(t1.c1)))) UNION ALL SELECT t1.c0, t1.c1, t5.c0, t5.c1 FROM t1* FULL OUTER JOIN t5* ON (('}n()')LIKE((((0.7999568)IS DISTINCT FROM(0.53532124)))::VARCHAR(100))) WHERE (NOT (((t1.c1)>=(t1.c1)))) ISNULL;

SELECT ALL t1.c0, t1.c1, t5.c0, t5.c1 FROM t1 FULL OUTER JOIN t5* ON (('}n()')LIKE((((0.7999568)IS DISTINCT FROM(0.53532124)))::VARCHAR(100)));

Operating System

ubuntu22

Anything else

No response

Are you willing to submit PR?

Yes, I am willing to submit a PR!

Code of Conduct

I agree to follow this project's Code of Conduct.

gfphoenix78 · 2024-09-27T03:18:23Z

A minimal repro is:

create table t1(a int, b int not null);
create table t2(like t1);
insert into t1 select 1, i from generate_series(1,3)i;
insert into t2 select 1, i from generate_series(4,6)i;

set optimizer=on;
explain select t1.*, t2.* from t1 full join t2 on false where (t1.b < t1.b) is null;
select t1.*, t2.* from t1 full join t2 on false where (t1.b < t1.b) is null;

not null is essential for column b to repro this bug.

roseduan · 2024-10-11T02:03:47Z

I find a simialr PR for this issue in gpdb, commit id 30cfe889e95dd78c160a0d855dba5d6125ca8bc4

Seems like it is a related PR with this problem.
BTW, gpdb has no such problem.

my-ship-it · 2024-11-08T01:22:53Z

@fanfuxiaoran Please help have a look, thanks!

fanfuxiaoran · 2024-11-08T09:50:19Z

Have stepped into the query, found :

explain (verbose)  SELECT t1.c0, t1.c1, t5.c0, t5.c1 FROM t1* FULL OUTER JOIN t5* ON (('}n()')LIKE((((0.7999568)IS DISTINCT FROM(0.53532124)))::VARCHAR(100)))
and  (NOT (((t1.c1)>=(t1.c1)))) ISNULL;
                                          QUERY PLAN
-----------------------------------------------------------------------------------------------
 Merge Full Join  (cost=10000000000.00..10027262749.67 rows=46700 width=80)
   Output: t1.c0, t1.c1, t5.c0, t5.c1
   Join Filter: false
   ->  Gather Motion 3:1  (slice1; segments: 3)  (cost=0.00..812.33 rows=46700 width=40)
         Output: t1.c0, t1.c1
         ->  Seq Scan on public.t1  (cost=0.00..189.67 rows=15567 width=40)
               Output: t1.c0, t1.c1
   ->  Materialize  (cost=0.00..929.08 rows=46700 width=40)
         Output: t5.c0, t5.c1
         ->  Gather Motion 3:1  (slice2; segments: 3)  (cost=0.00..812.33 rows=46700 width=40)
               Output: t5.c0, t5.c1
               ->  Seq Scan on public.t5  (cost=0.00..189.67 rows=15567 width=40)
                     Output: t5.c0, t5.c1
 Optimizer: Postgres query optimizer
(14 rows)

the commit 30cfe88 from gpdb only can handle the query which contains FALSE on join condition .

explain (verbose)  SELECT t1.c0, t1.c1, t5.c0, t5.c1 FROM t1* FULL OUTER JOIN t5* ON (('}n()')LIKE((((0.7999568)IS DISTINCT FROM(0.53532124)))::VARCHAR(100)))
WHERE (NOT (((t1.c1)>=(t1.c1)))) ISNULL;
                                    QUERY PLAN
----------------------------------------------------------------------------------
 Result  (cost=0.00..0.00 rows=0 width=32)
   Output: NULL::double precision, NULL::inet, NULL::double precision, NULL::inet
   One-Time Filter: false
 Optimizer: Pivotal Optimizer (GPORCA)
(4 rows)

But for the above query, the ISNULL is in where condition, it runs uncorrectly.

I think ISNULL predicate should not be pushed down to relation scan when existing FULL OUTER JOIN

Will continue to figure out how to fix this .

fanfuxiaoran · 2024-11-21T03:13:38Z

Found the root cause:
for the below query

create table t1(a int, b int not null);
create table t2(like t1);
select t1.*, t2.* from t1 full join t2 on false where (t1.b < t1.b) is null;

The original Algebrized query is as below

Algebrized query:
+--CLogicalSelect
   |--CLogicalFullOuterJoin
   |  |--CLogicalGet "t1" ("t1"), Columns: ["a" (0), "b" (1), "ctid" (2), "xmin" (3), "cmin" (4), "xmax" (5), "cmax" (6), "tableoid" (7), "gp_segment_id" (8), "gp_foreign_server" (9)] Key sets: {[2,8]}
   |  |--CLogicalGet "t2" ("t2"), Columns: ["a" (10), "b" (11), "ctid" (12), "xmin" (13), "cmin" (14), "xmax" (15), "cmax" (16), "tableoid" (17), "gp_segment_id" (18), "gp_foreign_server" (19)] Key sets: {[2,8]}
   |  +--CScalarConst (0)
   +--CScalarNullTest
      +--CScalarCmp (<)
         |--CScalarIdent "b" (1)
         +--CScalarIdent "b" (1)

After PexprEliminateSelfComparison called (used to eliminate self comparisons)
the Algebrized query is :

--CLogicalSelect
   |--CLogicalFullOuterJoin
   |  |--CLogicalGet "t1" ("t1"), Columns: ["a" (0), "b" (1), "ctid" (2), "xmin" (3), "cmin" (4), "xmax" (5), "cmax" (6), "tableoid" (7), "gp_segment_id" (8), "gp_foreign_server" (9)] Key sets: {[2,8]}
   |  |--CLogicalGet "t2" ("t2"), Columns: ["a" (10), "b" (11), "ctid" (12), "xmin" (13), "cmin" (14), "xmax" (15), "cmax" (16), "tableoid" (17), "gp_segment_id" (18), "gp_foreign_server" (19)] Key sets: {[2,8]}
   |  +--CScalarConst (0)
   +--CScalarNullTest
      +--CScalarConst (0)

as we can see that

+--CScalarCmp (<)
         |--CScalarIdent "b" (1)
         +--CScalarIdent "b" (1)

has been transformed into CScalarConst (0), this step is wrong.

The cause is that when checking if the selfcomparison can be simplified in function FSelfComparison, it checks the CColRef IsNullable only from the column definition. Not checking if the column is from outer join.

fanfuxiaoran · 2024-11-21T03:21:32Z

#722 the pr has fixed this issue

However, the orca cannot generate a plan for it. Even the column defined as nullable,
orca failed to generate a plan for the query above currently. But it's another issue.
Have created an issue for it: #723

shmiwy added the type: Bug Something isn't working label Aug 23, 2024

my-ship-it added the good first issue Good for newcomers label Aug 26, 2024

gfphoenix78 added the type: Orca only orca has the issue label Sep 27, 2024

gfphoenix78 added the priority: High After critical issues are fixed, these should be dealt with before any further issues. label Sep 27, 2024

my-ship-it assigned fanfuxiaoran Nov 8, 2024

fanfuxiaoran linked a pull request Nov 21, 2024 that will close this issue

WIP: ORCA: Fix eliminate self comparison #722

Open

12 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Bug] Error detected by sqlancer #594

[Bug] Error detected by sqlancer #594

shmiwy commented Aug 23, 2024

gfphoenix78 commented Sep 27, 2024

roseduan commented Oct 11, 2024 •

edited

Loading

my-ship-it commented Nov 8, 2024

fanfuxiaoran commented Nov 8, 2024 •

edited

Loading

fanfuxiaoran commented Nov 21, 2024 •

edited

Loading

fanfuxiaoran commented Nov 21, 2024 •

edited

Loading

[Bug] Error detected by sqlancer #594

[Bug] Error detected by sqlancer #594

Comments

shmiwy commented Aug 23, 2024

Cloudberry Database version

What happened

What you think should happen instead

How to reproduce

Operating System

Anything else

Are you willing to submit PR?

Code of Conduct

gfphoenix78 commented Sep 27, 2024

roseduan commented Oct 11, 2024 • edited Loading

my-ship-it commented Nov 8, 2024

fanfuxiaoran commented Nov 8, 2024 • edited Loading

fanfuxiaoran commented Nov 21, 2024 • edited Loading

fanfuxiaoran commented Nov 21, 2024 • edited Loading

roseduan commented Oct 11, 2024 •

edited

Loading

fanfuxiaoran commented Nov 8, 2024 •

edited

Loading

fanfuxiaoran commented Nov 21, 2024 •

edited

Loading

fanfuxiaoran commented Nov 21, 2024 •

edited

Loading