refactor(polars): use Select op within polars backend #10005
Comments
I think dropping those other operations makes it harder to understand sequences of operations conceptually, so I'd prefer not to drop them. That's how Ibis used to represent all selections, and it was very difficult to understand the process by which expressions moved into SQL (or whatever else). Splitting things up into separate operations isn't without trade-offs, but it helps a lot with isolating the various parts of the compilation pipeline and reasoning about what is and isn't allowed (structurally, typing-wise, and optimization-wise).
+1 to using
Doesn't the rewrite system help a bit with alleviating this? Besides the fusion code (
I believe all the DerefMap code depends on the various operations being split up.
Ah, it does. Missed that file, thanks.
Our other SQL backends convert Project/Filter/Sort/Distinct into a single Select operation. This fusion both results in simpler SQL, and results in these operations being (with some exceptions) commutative. In #9923 I added a test for this commutativity which is currently failing for the polars backend since we don't rewrite these queries to Select nodes.

I think the easiest fix would be to use the same rewrites as the SQL backend to generate SQL nodes. With the deprecation (and future removal) of the dask/pandas backends, another option would be to drop Project/Filter/Sort/Distinct entirely internally and only make use of the more general Select op.
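To illustrate the commutativity point, here is a minimal toy sketch of fusing Filter/Sort chains into one Select node. It is not the Ibis rewrite code: the classes `Table`, `Filter`, `Sort`, `Select` and the `fuse` function below are hypothetical stand-ins defined only for this example, and the sketch ignores Project, Distinct, and the exceptions mentioned above.

```python
from dataclasses import dataclass, replace


# Hypothetical toy nodes; these are NOT the real Ibis operation classes.
@dataclass(frozen=True)
class Table:
    name: str


@dataclass(frozen=True)
class Filter:
    parent: object
    predicates: tuple


@dataclass(frozen=True)
class Sort:
    parent: object
    keys: tuple


@dataclass(frozen=True)
class Select:
    parent: Table
    predicates: tuple = ()
    sort_keys: tuple = ()


def fuse(node):
    """Collapse a chain of Filter/Sort nodes over a Table into one Select."""
    if isinstance(node, Table):
        return Select(parent=node)
    fused = fuse(node.parent)
    if isinstance(node, Filter):
        # Predicates accumulate in the order they were applied.
        return replace(fused, predicates=fused.predicates + node.predicates)
    if isinstance(node, Sort):
        # A later sort takes precedence, so its keys go first.
        return replace(fused, sort_keys=node.keys + fused.sort_keys)
    raise TypeError(f"unsupported node: {type(node).__name__}")


t = Table("t")
filter_then_sort = Sort(Filter(t, ("x > 1",)), ("y",))
sort_then_filter = Filter(Sort(t, ("y",)), ("x > 1",))

# Both orderings fuse to the same Select node.
print(fuse(filter_then_sort) == fuse(sort_then_filter))
```

In this toy model the two orderings produce different trees before fusion and identical Select nodes after it, which is the property the test in #9923 checks for real backends.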