Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove the schema prefix when we do joins #10

Closed
wants to merge 1 commit into from

Conversation

mpatou
Copy link
Contributor

@mpatou mpatou commented Mar 2, 2024

Summary

On the columns that are part of the clause for a join, remove the schema
prefix by creating an alias based on the table and setting the table
attribute of the columns in the join clause to be alias and not the
table.
The name of the alias is picked to be the name of the table as well as
it works in plain SQL.

We only do that if:

  1. there is a schema prefix on the table (ie. commons or production)
  2. the "table" is not actually already an alias
  3. the "table" is not actually a subquery

Testing

I created unit tests for the visit_join function they are all passing.
I also created a wheel package and uploaded it in superset, I used to
have issues with queries when superset wanted to do a self join to limit
the number of series returned in a query.
It was complaining:

Relation name `commons` not found. If you are trying to access a nested
field within an object

With the fix the query is working fine.

@mpatou
Copy link
Contributor Author

mpatou commented Mar 2, 2024

I also tried the tests without the fix and I would get error like :

'"s1"."t1" JOIN "s2"."t2" ON "x" = "s2"."t2"."y"' != '"s1"."t1" JOIN "s2"."t2" ON "x" = "t2"."y"'

@mpatou mpatou requested review from dhruba, jhochmuth and haneeshr March 2, 2024 03:50
@mpatou mpatou force-pushed the fix_joins_schema branch from 981266f to 2d9772e Compare March 4, 2024 20:05
@mpatou mpatou changed the base branch from add_ci_github to main March 4, 2024 20:05
@mpatou mpatou force-pushed the fix_joins_schema branch from 2d9772e to 8530127 Compare March 4, 2024 20:16
@mpatou mpatou changed the base branch from main to redo_github_ci March 4, 2024 20:16
@mpatou mpatou force-pushed the fix_joins_schema branch from 8530127 to 5a0828c Compare March 4, 2024 20:25
@mpatou mpatou force-pushed the fix_joins_schema branch from 5a0828c to d40f65a Compare March 4, 2024 20:28
Summary

On the columns that are part of the clause for a join, remove the schema
prefix by creating an alias based on the table and setting the `table`
attribute of the columns in the join clause to be alias and not the
table.
The name of the alias is picked to be the name of the table as well as
it works in plain SQL.

We only do that if:

1. there is a schema prefix on the table (ie. `commons` or `production`)
2. the "table" is not actually already an alias
3. the "table" is not actually a subquery

Testing

I created unit tests for the visit_join function they are all passing.
I also created a wheel package and uploaded it in `superset`, I used to
have issues with queries when superset wanted to do a self join to limit
the number of series returned in a query.
It was complaining:
```
Relation name `commons` not found. If you are trying to access a nested
field within an object
```
With the fix the query is working fine.
@mpatou mpatou force-pushed the fix_joins_schema branch from d40f65a to c577823 Compare March 4, 2024 20:31
@mpatou mpatou changed the base branch from redo_github_ci to fix_formatting March 4, 2024 20:31
@mpatou mpatou deleted the branch fix_formatting March 5, 2024 22:55
@mpatou mpatou closed this Mar 5, 2024
@mpatou mpatou deleted the fix_joins_schema branch April 2, 2024 16:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants