Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Hints do not work on subqueries with DISTINCT #187

Open
DontBreakAlex opened this issue Apr 29, 2024 · 1 comment
Open

Hints do not work on subqueries with DISTINCT #187

DontBreakAlex opened this issue Apr 29, 2024 · 1 comment

Comments

@DontBreakAlex
Copy link

I could not get some hint to work a bigger query and narrowed the issue to this pattern:

CREATE TABLE t1 (id int PRIMARY KEY, val int);
CREATE TABLE t2 (id int PRIMARY KEY, val int);
EXPLAIN
/*+
    MergeJoin(t1 t2)
    Rows(t1 t2 #1234)
 */
SELECT *
FROM (SELECT DISTINCT ON (id) *
      FROM t1) S
            INNER JOIN t2 ON S.id = t2.id;

QUERY PLAN
Hash Join  (cost=61.01..150.50 rows=2260 width=16)
  Hash Cond: (t1.id = t2.id)
  ->  Unique  (cost=0.15..83.71 rows=2260 width=8)
        ->  Index Scan using t1_pkey on t1  (cost=0.15..78.06 rows=2260 width=8)
  ->  Hash  (cost=32.60..32.60 rows=2260 width=8)
        ->  Seq Scan on t2  (cost=0.00..32.60 rows=2260 width=8)

Without the DISTINCT ON, the hints work fine:

EXPLAIN
/*+
    MergeJoin(t1 t2)
    Rows(t1 t2 #1234)
 */
SELECT *
FROM (SELECT *
      FROM t1) S
            INNER JOIN t2 ON S.id = t2.id;

QUERY PLAN
Merge Join  (cost=0.31..190.01 rows=1234 width=16)
  Merge Cond: (t1.id = t2.id)
  ->  Index Scan using t1_pkey on t1  (cost=0.15..78.06 rows=2260 width=8)
  ->  Index Scan using t2_pkey on t2  (cost=0.15..78.06 rows=2260 width=8)

Using ANY_subquery does not work

EXPLAIN
/*+
    MergeJoin(ANY_subquery t2)
    Rows(ANY_subquery t2 #1234)
 */
SELECT *
FROM (SELECT DISTINCT ON (id) *
      FROM t1) S
            INNER JOIN t2 ON S.id = t2.id;

QUERY PLAN
Hash Join  (cost=61.01..150.50 rows=2260 width=16)
  Hash Cond: (t1.id = t2.id)
  ->  Unique  (cost=0.15..83.71 rows=2260 width=8)
        ->  Index Scan using t1_pkey on t1  (cost=0.15..78.06 rows=2260 width=8)
  ->  Hash  (cost=32.60..32.60 rows=2260 width=8)
        ->  Seq Scan on t2  (cost=0.00..32.60 rows=2260 width=8)

Merge joins are possible, if I set ENABLE_HASHJOIN to OFF, I get:

QUERY PLAN
Merge Join  (cost=0.31..195.66 rows=2260 width=16)
  Merge Cond: (t1.id = t2.id)
  ->  Unique  (cost=0.15..83.71 rows=2260 width=8)
        ->  Index Scan using t1_pkey on t1  (cost=0.15..78.06 rows=2260 width=8)
  ->  Index Scan using t2_pkey on t2  (cost=0.15..78.06 rows=2260 width=8)

I may be missing something, but this looks like a bug.
I'm using PostgreSQL 16.2 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 7.3.1 20180712 (Red Hat 7.3.1-12), 64-bit on AWS RDS.

@michaelpq
Copy link
Collaborator

Yeah, this would be a new feature in itself. I don't think that anybody has bothered with this case yet, would you be interested in writing a patch?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants