ADBDEV-3999: ORCA generates incorrect plan with exists clause for partitioned table #576

Merged
merged 40 commits into from
Feb 9, 2024
Commits
376516f
ORCA generates incorrect plan with exists clause for partitioned table
RekGRpth Aug 2, 2023
048a7ee
Merge branch 'adb-6.x-dev' into ADBDEV-3999
RekGRpth Aug 4, 2023
029aa68
fix
RekGRpth Aug 28, 2023
115ab42
Merge branch 'adb-6.x-dev' into ADBDEV-3999
RekGRpth Aug 28, 2023
fc74d9c
another approach
RekGRpth Aug 29, 2023
640afd8
Merge branch 'adb-6.x-dev' into ADBDEV-3999
RekGRpth Aug 29, 2023
731c150
get first used column for exist subquery
RekGRpth Sep 6, 2023
a8911e5
Merge branch 'adb-6.x-dev' into ADBDEV-3999
RekGRpth Sep 6, 2023
4634f70
fix regress test
RekGRpth Sep 6, 2023
97a5c1e
fix
RekGRpth Sep 7, 2023
cd7b60a
comment
RekGRpth Sep 7, 2023
bd04aec
split
RekGRpth Sep 7, 2023
6a1811d
comment
RekGRpth Sep 7, 2023
416f58f
simplify
RekGRpth Sep 7, 2023
4b52a6c
mark first as used
RekGRpth Sep 21, 2023
3f97999
Merge branch 'adb-6.x-dev' into ADBDEV-3999
RekGRpth Sep 21, 2023
ebe72a2
Merge branch 'adb-6.x-dev' into ADBDEV-3999
RekGRpth Nov 2, 2023
0d26ab4
revork solution
RekGRpth Nov 2, 2023
1248b1b
Revert "revork solution"
RekGRpth Nov 3, 2023
bf51e90
Merge branch 'adb-6.x-dev' into ADBDEV-3999
RekGRpth Dec 26, 2023
4440325
comment expanded
RekGRpth Dec 29, 2023
ebaa3bd
remove first change and add assert
RekGRpth Dec 29, 2023
f5f1d2f
revert
RekGRpth Dec 29, 2023
31ae04a
Merge branch 'adb-6.x-dev' into ADBDEV-3999
RekGRpth Dec 29, 2023
e400238
revrite solution to fix partitioned table
RekGRpth Dec 29, 2023
5a5dc2e
Merge branch 'adb-6.x-dev' into ADBDEV-3999
RekGRpth Jan 10, 2024
3a0b08d
move logic from parent to child class
RekGRpth Jan 10, 2024
58f4503
clang format
RekGRpth Jan 10, 2024
78bc439
add bitmap and index scans to tests
RekGRpth Jan 12, 2024
6b51d33
fix bitmap test
RekGRpth Jan 15, 2024
107307c
Merge branch 'adb-6.x-dev' into ADBDEV-3999
RekGRpth Jan 15, 2024
f5ccf22
move logic from child to parent
RekGRpth Jan 16, 2024
ca7286b
revert
RekGRpth Jan 16, 2024
c9ee898
update tests
RekGRpth Jan 16, 2024
65678c2
fix tests
RekGRpth Jan 16, 2024
ac5ec4f
Merge branch 'adb-6.x-dev' into ADBDEV-3999
RekGRpth Jan 30, 2024
ee862db
ORCA generates incorrect plan with exists clause for partitioned table
RekGRpth Feb 5, 2024
2ff183a
Merge branch 'adb-6.x-dev' into ADBDEV-3999
RekGRpth Feb 5, 2024
219d9b3
Merge branch 'adb-6.x-dev' into ADBDEV-3999
RekGRpth Feb 6, 2024
8cfee82
Merge branch 'adb-6.x-dev' into ADBDEV-3999
andr-sokolov Feb 9, 2024
3 changes: 3 additions & 0 deletions src/backend/gporca/libgpopt/include/gpopt/base/CColRefSet.h
@@ -86,6 +86,9 @@ class CColRefSet : public CBitSet, public DbgPrintMixin<CColRefSet>
	// return first member
	CColRef *PcrFirst() const;

	// return first used member
	CColRef *PcrFirstUsed() const;

	// include column
	void Include(const CColRef *colref);

32 changes: 32 additions & 0 deletions src/backend/gporca/libgpopt/src/base/CColRefSet.cpp
@@ -126,6 +126,38 @@ CColRefSet::PcrFirst() const
	return NULL;
}

//---------------------------------------------------------------------------
// @function:
// CColRefSet::PcrFirstUsed
//
// @doc:
// Return first used member; if there is none, mark first member as used and return it
//
//---------------------------------------------------------------------------
CColRef *
CColRefSet::PcrFirstUsed() const
{
	CColRefSetIter crsi(*this);
	while (crsi.Advance())
	{
		CColRef *pcr = crsi.Pcr();

		if (CColRef::EUsed == pcr->GetUsage())
		{
			return pcr;
		}
	}

	CColRef *pcr = PcrFirst();

	if (NULL != pcr)
	{
		pcr->MarkAsUsed();
	}

	return pcr;
}

//---------------------------------------------------------------------------
// @function:
// CColRefSet::Include
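
The selection rule in `PcrFirstUsed` can be tried in isolation. The block below is a minimal sketch, not ORCA code: `ColRef`, `ColRefSet`, `FirstUsed`, and `DemoFirstUsed` are hypothetical stand-ins introduced here for illustration, and a plain `std::vector` replaces the bitset-backed `CColRefSet` and its iterator. It assumes the vector order matches the order in which `PcrFirst` and `CColRefSetIter` would visit members.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for CColRef: only the usage flag is modeled.
struct ColRef {
    enum Usage { kUsed, kUnused };
    int id;
    Usage usage;
    void MarkAsUsed() { usage = kUsed; }
};

// Hypothetical stand-in for CColRefSet: an ordered list of non-owning pointers.
using ColRefSet = std::vector<ColRef *>;

// Return the first column already marked as used. If no column is marked,
// fall back to the first column and mark it as used. Returns nullptr for an
// empty set. Like the original, this may mutate a member as a side effect.
ColRef *FirstUsed(const ColRefSet &set) {
    for (ColRef *col : set) {
        if (col->usage == ColRef::kUsed) {
            return col;
        }
    }
    if (set.empty()) {
        return nullptr;
    }
    ColRef *first = set.front();
    first->MarkAsUsed();
    return first;
}

// Illustration loosely modeled on the regression test's inner table, assuming
// only the id column is referenced by the query.
inline void DemoFirstUsed() {
    ColRef contact{1, ColRef::kUnused};
    ColRef id{2, ColRef::kUsed};
    ColRefSet cols{&contact, &id};

    // The unused leading column is skipped in favor of the referenced one.
    assert(FirstUsed(cols) == &id);
    assert(contact.usage == ColRef::kUnused);
}
```

Calling `DemoFirstUsed()` from a `main` exercises the case the PR targets: the leading column is unused, so the referenced column is returned and the leading column's flag is left untouched. When no column is marked, the sketch returns the first column and flips its flag, which mirrors the fallback branch of the original.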
10 changes: 6 additions & 4 deletions src/backend/gporca/libgpopt/src/xforms/CSubqueryHandler.cpp
@@ -1300,8 +1300,9 @@ CSubqueryHandler::FCreateCorrelatedApplyForExistentialSubquery(
 	CExpression *pexprInner = (*pexprSubquery)[0];
 
 	// for existential subqueries, any column produced by inner expression
-	// can be used to check for empty answers; we use first column for that
-	CColRef *colref = pexprInner->DeriveOutputColumns()->PcrFirst();
+	// can be used to check for empty answers;
+	// we use first used (referenced in the query) column for that
+	CColRef *colref = pexprInner->DeriveOutputColumns()->PcrFirstUsed();
 
 	pexprInner->AddRef();
 	if (EsqctxtFilter == esqctxt)
@@ -1932,8 +1933,9 @@ CSubqueryHandler::FRemoveExistentialSubquery(
 	GPOS_ASSERT(EsqctxtFilter == esqctxt);
 
 	// for existential subqueries, any column produced by inner expression
-	// can be used to check for empty answers; we use first column for that
-	CColRef *colref = pexprInner->DeriveOutputColumns()->PcrFirst();
+	// can be used to check for empty answers;
+	// we use first used (referenced in the query) column for that
+	CColRef *colref = pexprInner->DeriveOutputColumns()->PcrFirstUsed();
 
 	if (COperator::EopScalarSubqueryExists == op_id)
 	{
49 changes: 49 additions & 0 deletions src/test/regress/expected/qp_correlated_query.out
@@ -3910,6 +3910,55 @@ DROP TABLE skip_correlated_t3;
DROP TABLE skip_correlated_t4;
reset optimizer_join_order;
reset optimizer_trace_fallback;
--------------------------------------------------------------------------------
-- Ensure ORCA generates the correct plan with the exists clause
-- for the partitioned table.
--------------------------------------------------------------------------------
CREATE TABLE offers (
id int,
product int,
date date
) DISTRIBUTED BY (id);
INSERT INTO offers SELECT 0, 0, '2023-01-01';
CREATE TABLE contacts (
contact text,
id int,
date date
) DISTRIBUTED BY (id) PARTITION BY RANGE(date) (START (date '2023-01-01') INCLUSIVE END (date '2023-02-01') EXCLUSIVE EVERY (INTERVAL '1 month'));
NOTICE: CREATE TABLE will create partition "contacts_1_prt_1" for table "contacts"
INSERT INTO contacts SELECT '0', 0, '2023-01-01';
EXPLAIN (COSTS off, VERBOSE on)
SELECT id FROM offers WHERE EXISTS (
SELECT id FROM contacts WHERE product = 0 OR contacts.id = offers.id
);
QUERY PLAN
------------------------------------------------------------------------------------------------------------
Gather Motion 3:1 (slice3; segments: 3)
Output: offers.id
-> HashAggregate
Output: offers.id
Group Key: offers.ctid, offers.gp_segment_id
-> Redistribute Motion 3:3 (slice2; segments: 3)
Output: offers.id, offers.ctid, offers.gp_segment_id
Hash Key: offers.ctid
-> Nested Loop
Output: offers.id, offers.ctid, offers.gp_segment_id
Join Filter: ((offers.product = 0) OR (contacts_1_prt_1.id = offers.id))
-> Append
-> Seq Scan on qp_correlated_query.contacts_1_prt_1
Output: contacts_1_prt_1.id
-> Materialize
Output: offers.id, offers.product, offers.ctid, offers.gp_segment_id
-> Broadcast Motion 3:3 (slice1; segments: 3)
Output: offers.id, offers.product, offers.ctid, offers.gp_segment_id
-> Seq Scan on qp_correlated_query.offers
Output: offers.id, offers.product, offers.ctid, offers.gp_segment_id
Optimizer: Postgres query optimizer
Settings: optimizer=off
(22 rows)

DROP TABLE offers;
DROP TABLE contacts;
-- ----------------------------------------------------------------------
-- Test: teardown.sql
-- ----------------------------------------------------------------------
52 changes: 52 additions & 0 deletions src/test/regress/expected/qp_correlated_query_optimizer.out
@@ -4050,6 +4050,58 @@ DROP TABLE skip_correlated_t3;
DROP TABLE skip_correlated_t4;
reset optimizer_join_order;
reset optimizer_trace_fallback;
--------------------------------------------------------------------------------
-- Ensure ORCA generates the correct plan with the exists clause
-- for the partitioned table.
--------------------------------------------------------------------------------
CREATE TABLE offers (
id int,
product int,
date date
) DISTRIBUTED BY (id);
INSERT INTO offers SELECT 0, 0, '2023-01-01';
CREATE TABLE contacts (
contact text,
id int,
date date
) DISTRIBUTED BY (id) PARTITION BY RANGE(date) (START (date '2023-01-01') INCLUSIVE END (date '2023-02-01') EXCLUSIVE EVERY (INTERVAL '1 month'));
NOTICE: CREATE TABLE will create partition "contacts_1_prt_1" for table "contacts"
INSERT INTO contacts SELECT '0', 0, '2023-01-01';
EXPLAIN (COSTS off, VERBOSE on)
SELECT id FROM offers WHERE EXISTS (
SELECT id FROM contacts WHERE product = 0 OR contacts.id = offers.id
);
QUERY PLAN
-------------------------------------------------------------------------------------------------------------
Result
Output: offers.id
-> Result
Output: offers.id, offers.product
Filter: (SubPlan 1)
-> Gather Motion 3:1 (slice1; segments: 3)
Output: offers.id, offers.product
-> Seq Scan on qp_correlated_query.offers
Output: offers.id, offers.product
SubPlan 1 (slice0)
-> Result
Output: contacts.id
Filter: ((offers.product = 0) OR (contacts.id = offers.id))
-> Materialize
Output: contacts.id
-> Gather Motion 3:1 (slice2; segments: 3)
Output: contacts.id
-> Sequence
Output: contacts.id
-> Partition Selector for contacts (dynamic scan id: 1)
Partitions selected: 1 (out of 1)
-> Dynamic Seq Scan on qp_correlated_query.contacts (dynamic scan id: 1)
Output: contacts.id
Optimizer: Pivotal Optimizer (GPORCA)
Settings: optimizer=on
(25 rows)

DROP TABLE offers;
DROP TABLE contacts;
-- ----------------------------------------------------------------------
-- Test: teardown.sql
-- ----------------------------------------------------------------------
2 changes: 0 additions & 2 deletions src/test/regress/expected/subselect_gp_optimizer.out
@@ -509,7 +509,6 @@ explain SELECT * FROM csq_r WHERE exists (SELECT * FROM csq_f(csq_r.a));
  SubPlan 1 (slice1; segments: 3)
  -> Result (cost=0.00..0.00 rows=1 width=1)
  -> Result (cost=0.00..0.00 rows=1 width=1)
- -> Result (cost=0.00..0.00 rows=1 width=1)
  Optimizer: Pivotal Optimizer (GPORCA) version 2.55.21
  (8 rows)

@@ -529,7 +528,6 @@ explain SELECT * FROM csq_r WHERE not exists (SELECT * FROM csq_f(csq_r.a));
  SubPlan 1 (slice1; segments: 3)
  -> Result (cost=0.00..0.00 rows=1 width=1)
  -> Result (cost=0.00..0.00 rows=1 width=1)
- -> Result (cost=0.00..0.00 rows=1 width=1)
  Optimizer: Pivotal Optimizer (GPORCA) version 2.55.21
  (8 rows)

22 changes: 22 additions & 0 deletions src/test/regress/sql/qp_correlated_query.sql
@@ -834,6 +834,28 @@ DROP TABLE skip_correlated_t4;
reset optimizer_join_order;
reset optimizer_trace_fallback;

--------------------------------------------------------------------------------
-- Ensure ORCA generates the correct plan with the exists clause
-- for the partitioned table.
--------------------------------------------------------------------------------
CREATE TABLE offers (
id int,
product int,
date date
) DISTRIBUTED BY (id);
INSERT INTO offers SELECT 0, 0, '2023-01-01';
CREATE TABLE contacts (
contact text,
id int,
date date
) DISTRIBUTED BY (id) PARTITION BY RANGE(date) (START (date '2023-01-01') INCLUSIVE END (date '2023-02-01') EXCLUSIVE EVERY (INTERVAL '1 month'));
INSERT INTO contacts SELECT '0', 0, '2023-01-01';
EXPLAIN (COSTS off, VERBOSE on)
SELECT id FROM offers WHERE EXISTS (
SELECT id FROM contacts WHERE product = 0 OR contacts.id = offers.id
);
DROP TABLE offers;
DROP TABLE contacts;
-- ----------------------------------------------------------------------
-- Test: teardown.sql
-- ----------------------------------------------------------------------