Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Inconsistent query results with left joins #15

Open
yanmendes opened this issue Apr 3, 2020 · 5 comments
Open

Inconsistent query results with left joins #15

yanmendes opened this issue Apr 3, 2020 · 5 comments

Comments

@yanmendes
Copy link

I have added my own data to BigDAWG, however, it's not processing left joins correctly. As a matter of fact, it's outputting the same result as an inner join would. Evidence and how to reproduce below:

Issuing query via BigDAWG's interface:

curl -X POST -d \
"bdrel(SELECT e.id, w.name, e.name FROM entity as e LEFT JOIN workflow as w ON w.id = e.id)" http://localhost:8080/bigdawg/query

id      name    name
1       04-HelloWorld

Querying data directly (LEFT JOIN)

psql -d postgresql://postgres@localhost:5401/kepler -c \
"SELECT e.id, w.name as wname, e.name as ename FROM entity as e LEFT JOIN workflow as w ON w.id = e.id;"

 id |     wname     |                 ename                 
----+---------------+---------------------------------------
  1 | 04-HelloWorld | 
  2 |               | .entityId
  3 |               | .SDF Director
  4 |               | .SDF Director.iterations
  5 |               | .SDF Director.vectorizationFactor
  6 |               | .SDF Director.allowDisconnectedGraphs
  7 |               | .SDF Director.allowRateChanges
  8 |               | .SDF Director.constrainBufferSizes
  9 |               | .SDF Director.period
 10 |               | .SDF Director.synchronizeToRealTime
 11 |               | .SDF Director.timeResolution
 12 |               | .SDF Director.entityId
 13 |               | .Annotation

Querying the data directly (INNER JOIN)

psql -d postgresql://postgres@localhost:5401/kepler -c \
"SELECT e.id, w.name as wname, e.name as ename FROM entity as e INNER JOIN workflow as w ON w.id = e.id;"

 id |     wname     | ename 
----+---------------+-------
  1 | 04-HelloWorld |

If you need a reproducible environment, just follow the installation steps here

@mmucklo
Copy link
Contributor

mmucklo commented Apr 22, 2020

@yanmendes Update, I've successfully reproduced this on my own using a couple of test tables.

It's basically stripping the LEFT JOIN syntax for my and turning it into an INNER JOIN using the syntax without the JOIN keywords.

My test query:

bdrel(select * from mm.test a left join mm.test2 b on a.id = b.test_id;)

Bigdawg is doing the following query:

SELECT a.id, a.name, b.id, b.test_id, b.description FROM mm.test AS a, mm.test2 AS b WHERE a.id = b.test_id;

I have to deep dive into the SQL parser to find out where this loss of fidelity is going on.

@mmucklo
Copy link
Contributor

mmucklo commented Apr 24, 2020

@yanmendes

I've made some modifications and I now have a version that now successfully handle's LEFT JOIN clauses.

Remaining issue(s):
It still has problems with what postgres calls "Anti Joins" of which one example would be this query (given two tables mm.test and mm.test2).

SELECT a.id, a.name, b.id, b.test_id, b.description FROM mm.test AS a LEFT JOIN mm.test2 AS b ON a.id = b.test_id where b.test_id is null;

The overarching issue is that Bigdawg uses the Postgres planner to determine its SQL plan, which is in part a saving grace of not having to implement such a thing itself (caveats being it's done against the Schema tables which don't have rows, and hence no statistics so the plan will necessarily be optimized for an empty table case), however once the plan is generated internally to Bigdawg, it has to now (re)assemble a query from the plan.

While we know it's an Anti Join, it would be useful in order to reverse generate the original query to know that it's a left join (which is not known from the EXPLAIN statement).

Hence some further logic will probably need to be made to interpret the Anti Join.

This is the explain run:

EXPLAIN (VERBOSE ON, COSTS OFF, FORMAT XML) SELECT a.id, a.name, b.id, b.test_id, b.description FROM mm.test AS a LEFT JOIN mm.test2 AS b ON a.id = b.test_id where b.test_id is null;

This is the query plan returned (notice Join-Type):

 <explain xmlns="http://www.postgresql.org/2009/explain">
   <Query>
     <Plan>
       <Node-Type>Hash Join</Node-Type>
       <Join-Type>Anti</Join-Type>
       <Output>
         <Item>a.id</Item>
         <Item>a.name</Item>
         <Item>b.id</Item>
         <Item>b.test_id</Item>
         <Item>b.description</Item>
       </Output>
       <Hash-Cond>(a.id = b.test_id)</Hash-Cond>
       <Plans>
         <Plan>
           <Node-Type>Seq Scan</Node-Type>
           <Parent-Relationship>Outer</Parent-Relationship>
           <Relation-Name>test</Relation-Name>
           <Schema>mm</Schema>
           <Alias>a</Alias>
           <Output>
             <Item>a.id</Item>
             <Item>a.name</Item>
           </Output>
         </Plan>
         <Plan>
           <Node-Type>Hash</Node-Type>
           <Parent-Relationship>Inner</Parent-Relationship>
           <Output>
             <Item>b.id</Item>
             <Item>b.test_id</Item>
             <Item>b.description</Item>
           </Output>
           <Plans>
             <Plan>
               <Node-Type>Seq Scan</Node-Type>
               <Parent-Relationship>Outer</Parent-Relationship>
               <Relation-Name>test2</Relation-Name>
               <Schema>mm</Schema>
               <Alias>b</Alias>
               <Output>
                 <Item>b.id</Item>
                 <Item>b.test_id</Item>
                 <Item>b.description</Item>
               </Output>
             </Plan>
           </Plans>
         </Plan>
       </Plans>
     </Plan>
   </Query>
 </explain>

As an alternative a query plan generated from a pure Left join looks like this (notice the Join-Type node):

EXPLAIN (VERBOSE ON, COSTS OFF, FORMAT XML) SELECT a.id, a.name, b.id, b.test_id, b.description FROM mm.test AS a LEFT JOIN mm.test2 AS b ON a.id = b.test_id;
 <explain xmlns="http://www.postgresql.org/2009/explain">
   <Query>
     <Plan>
       <Node-Type>Hash Join</Node-Type>
       <Join-Type>Left</Join-Type>
       <Output>
         <Item>a.id</Item>
         <Item>a.name</Item>
         <Item>b.id</Item>
         <Item>b.test_id</Item>
         <Item>b.description</Item>
       </Output>
       <Hash-Cond>(a.id = b.test_id)</Hash-Cond>
       <Plans>
         <Plan>
           <Node-Type>Seq Scan</Node-Type>
           <Parent-Relationship>Outer</Parent-Relationship>
           <Relation-Name>test</Relation-Name>
           <Schema>mm</Schema>
           <Alias>a</Alias>
           <Output>
             <Item>a.id</Item>
             <Item>a.name</Item>
           </Output>
         </Plan>
         <Plan>
           <Node-Type>Hash</Node-Type>
           <Parent-Relationship>Inner</Parent-Relationship>
           <Output>
             <Item>b.id</Item>
             <Item>b.test_id</Item>
             <Item>b.description</Item>
           </Output>
           <Plans>
             <Plan>
               <Node-Type>Seq Scan</Node-Type>
               <Parent-Relationship>Outer</Parent-Relationship>
               <Relation-Name>test2</Relation-Name>
               <Schema>mm</Schema>
               <Alias>b</Alias>
               <Output>
                 <Item>b.id</Item>
                 <Item>b.test_id</Item>
                 <Item>b.description</Item>
               </Output>
             </Plan>
           </Plans>
         </Plan>
       </Plans>
     </Plan>
   </Query>
 </explain>

I think I can take an Anti join and possibly interpret it as a left join adding the appropriate where condition, but I'll have to do a little testing.

The flip side is I'm not sure an Anti join works either with the current code base, actually I'm certain it doesn't.

mmucklo added a commit to mmucklo/bigdawg that referenced this issue Apr 24, 2020
@mmucklo
Copy link
Contributor

mmucklo commented May 5, 2020

Okay so I've punted on the Anti Join issue, and started taking a look at what happens when a distributed query is processed (across two different servers).

Unfortunately it too drops the "LEFT" keyword off the ensuing join.

I've done some limited debugging using log statements and it's deep in the query traversing and 'optimizing' section.

It's going to be very useful to get a debugger running on this one, however it's tricky since it's a distributed join requiring a migration from the secondary server to be done first before the query is executed and running things locally everything shows up under a single IP (meaning I can't effectively do migrations from one server to another).

Although I can get Bigdawg running locally fine, the problem I have is getting an independent host (or maybe 2) setup for the migration test. Since I'm doing this work on Windows I can't assign the docker containers IP addresses (unsupported), so I've resorted to creating at least one separate VM to run the other server on independently. I'm part way through the process now, but I've only been able to devote bits of time here and there to it.

I could make Bigdawg more docker friendly by allowing the other copies of Bigdawg to have different ports, and then exposing those ports on docker, and storing them in the Catalog, but this is probably a bigger project than it's worth right now and only helps in this narrow debugging case for the moment (although perhaps a useful feature later).

Alternatively I may reinstall linux on this machine which I've had on here in the past, which has helped, and then be able to debug a lot easier if I can assign the containers independent IPs.

Anyway hopefully with in the next few days or so I'll have an update.

@mmucklo
Copy link
Contributor

mmucklo commented May 8, 2020

Ok, I got my test environment fully setup tonight, finally, and was able to debug then fix at least part of the issue with distributed joins - the LEFT and RIGHT keywords should not be dropped anymore for simple distributed joins.

I have to do more testing with complex joins as there appears to be additional edge cases in the code when joins are permuted where the type of join is ignored.

mmucklo added a commit to mmucklo/bigdawg that referenced this issue May 11, 2020
mmucklo added a commit to mmucklo/bigdawg that referenced this issue May 14, 2020
…n child nodes are likewise flipped, fix for Anti joins.
mmucklo added a commit to mmucklo/bigdawg that referenced this issue May 23, 2020
@mmucklo
Copy link
Contributor

mmucklo commented May 23, 2020

Ok, I've addressed now Semi and Anti joins. There are certainly edge cases that probably don't work right given how the code paths are, but the basics should be good.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants