feat: Support Daft dataframe as a backend #8904
Comments
Hi @jaychia! We'd be happy to help with adding in a Daft backend for Ibis. Does or will Daft have a SQL interface? I'm assuming it's only the Python dataframe interface. Today, Ibis supports 3 Python dataframe backends:
I believe the Daft API is most similar to the Polars API. Thus, I'd suggest taking a look at the Polars backend and starting there for the implementation of Daft. Generally the process will be to get the backend started -- creating a connection, implementing enough functionality to get data into it ( Ibis defines over 300 operations, many of which won't be applicable to every backend. You can see current coverage here: https://ibis-project.org/support_matrix. So it's completely fine to start with an MVP for the Daft backend and increase coverage over time. Let me know if you have any additional questions! I'd recommend essentially copying one of the existing backends (probably Polars), cutting it down, and working to get the test suite passing.
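To make the "start with an MVP and grow coverage" idea above concrete, here is a minimal, stdlib-only sketch. Everything in it (`ToyBackend`, the `_op_*` naming scheme, the operation names) is hypothetical and only illustrates the shape of the approach; it is not how Ibis backends or Daft are actually implemented.

```python
from dataclasses import dataclass, field


class UnsupportedOperationError(NotImplementedError):
    """Raised when the toy backend has no handler for an operation."""


@dataclass
class ToyBackend:
    # Hypothetical in-memory "engine": table name -> list of row dicts.
    tables: dict = field(default_factory=dict)

    def create_table(self, name, rows):
        # Step 1 of an MVP: get data into the backend.
        self.tables[name] = list(rows)

    def execute(self, op, table, **kwargs):
        # Dispatch to a handler if one exists; otherwise fail loudly so
        # missing coverage is easy to spot from a test suite.
        handler = getattr(self, f"_op_{op}", None)
        if handler is None:
            raise UnsupportedOperationError(op)
        return handler(self.tables[table], **kwargs)

    # A small starting set of operations; more can be added over time.
    def _op_select(self, rows, columns):
        return [{c: r[c] for c in columns} for r in rows]

    def _op_filter(self, rows, predicate):
        return [r for r in rows if predicate(r)]

    def _op_limit(self, rows, n):
        return rows[:n]

    def coverage(self, all_ops):
        # Fraction of the named operations this backend can handle.
        supported = [op for op in all_ops if hasattr(self, f"_op_{op}")]
        return len(supported) / len(all_ops)
```

The point of the design is that an unsupported operation raises a clear error rather than silently misbehaving, so coverage can be tracked and expanded incrementally.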
Yes indeed - Daft is probably most similar to the Polars lazy API. We do not yet have a SQL frontend. Would https://github.com/ibis-project/ibis/blob/main/ibis/backends/polars/tests/conftest.py be a good place to start to implement a backend?
@jaychia apologies for the slow response! yes, something like that -- you can take a look at the Polars implementation, get the basic tests passing, and go from there
A quick update here: The team is actively looking at building up a SQL frontend to Daft. We have basic support up already, with a more extensive roadmap detailed here: https://github.com/orgs/Eventual-Inc/projects/8/views/1 That might end up being the easiest way to integrate ibis, given that we can use SQL as the narrow waist between the ibis and Daft backends. Let me know if that makes sense, and if that might be the better way forward?
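The "SQL as the narrow waist" idea can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual Ibis or Daft integration: the `Query` class and the `to_sql` and `run` helpers are hypothetical, and Python's built-in `sqlite3` stands in for any engine that accepts SQL strings.

```python
import sqlite3
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Query:
    # Hypothetical, tiny stand-in for a frontend expression.
    table: str
    columns: tuple = ("*",)
    where: Optional[str] = None


def to_sql(q):
    # Frontend side: compile the expression to a SQL string.
    # Plain string formatting is used for brevity; it is not injection-safe.
    sql = f"SELECT {', '.join(q.columns)} FROM {q.table}"
    if q.where:
        sql += f" WHERE {q.where}"
    return sql


def run(engine, q):
    # Engine side: the only thing the engine needs to understand is SQL.
    return engine.execute(to_sql(q)).fetchall()


# sqlite3 is an assumption here, standing in for a SQL-capable engine.
engine = sqlite3.connect(":memory:")
engine.execute("CREATE TABLE t (a INTEGER, b TEXT)")
engine.executemany("INSERT INTO t VALUES (?, ?)", [(1, "x"), (2, "y"), (3, "z")])

rows = run(engine, Query("t", ("a", "b"), "a > 1"))
```

With this split, the frontend never calls engine-specific dataframe methods; swapping engines only requires that the generated SQL be in a dialect the engine accepts.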
hi @jaychia, that sounds like it would be a great option. is there a specific SQL dialect daft is targeting? if so, we could probably re-use one of the existing SQL compilers within Ibis (provided by SQLGlot)
No specific dialect at the moment; we're still building out SQL support in Daft and can provide more updates as we go along. IIUC, if we are compatible with any of SQLGlot's target dialects, then we should be good to go? Am I understanding correctly that Ibis does: cc @universalmind303 who is working on our SQL support
Correct.
Not quite, but close, it's
Wondering if a daft/ibis integration would be able to benefit from a distributed daft
Daft does indeed have a distributed deltalake writer. Not sure how that would need to interface with Ibis though. We also recently implemented SQL support which might help pave the way for easy Ibis integration, using SQL as the handoff point.
There is not, but we'd be happy to help someone get started working on it.
Yep, we would just map the That's effectively what we do with PySpark: https://github.com/ibis-project/ibis/blob/main/ibis/backends/pyspark/__init__.py#L985-L1020
@jaychia so do you think Daft is far enough for this ibis work to start?
I'd love to help with that. How can we connect?
Which new backend would you like to see in Ibis?
Hi! I would like to explore building a backend for Ibis for Daft (www.getdaft.io)
I am one of the maintainers of the project, and we have had some user interest in using Ibis as an interface for Daft. Daft is a distributed query engine built with a Python dataframe API, with most of its internals written in Rust.
We're not sure where to begin/how to think about potential integrations but would love some pointers. Primarily:
Excited for this :)