feat(catalog): Prepare existing catalog APIs for integration [1/3] #3820
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #3820 +/- ##
==========================================
+ Coverage 77.89% 78.03% +0.13%
==========================================
Files 751 754 +3
Lines 94879 95258 +379
==========================================
+ Hits 73905 74333 +428
+ Misses 20974 20925 -49
CodSpeed Performance Report
Merging #3820 will degrade performances by 25.1%
For the `"""DEPRECATED: Please use read()."""` docstrings, we can use the `warnings` stdlib as well, something like: `This is deprecated, please prefer using ... instead; version=0.5.0`
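A minimal sketch of that suggestion, assuming a hypothetical `deprecated` decorator and a hypothetical `to_dataframe` method standing in for whichever deprecated API is being wrapped (neither name comes from this PR). It only shows how the stdlib `warnings` module can carry the same message as the docstring:

```python
import functools
import warnings


def deprecated(replacement: str, version: str):
    """Hypothetical decorator: emit a DeprecationWarning on every call."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            warnings.warn(
                f"{func.__name__} is deprecated since {version}, "
                f"please prefer using {replacement} instead.",
                category=DeprecationWarning,
                stacklevel=2,  # attribute the warning to the caller's line
            )
            return func(*args, **kwargs)

        return wrapper

    return decorator


class Table:
    """Toy stand-in for a table class; not Daft's implementation."""

    def read(self):
        return "dataframe"

    @deprecated(replacement="read()", version="0.5.0")
    def to_dataframe(self):  # hypothetical deprecated method name
        """DEPRECATED: Please use read()."""
        return self.read()
```

Python hides `DeprecationWarning` by default unless it is triggered by code running in `__main__`, so callers may need `-W default` or a warnings filter to see it.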
def select(self, *columns: ColumnInputType) -> DataFrame:
    """Returns a DataFrame from this table with the selected columns."""
    return self.read().select(*columns)
Do we need this?
No, but I do like the UX and don't think `read()` should take any arguments.
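The design under discussion can be sketched as follows, using toy stand-ins for `DataFrame` and `Table` (these are not Daft's classes, and the dict-backed storage is an assumption made for illustration). `read()` takes no arguments and `select()` is a thin convenience that delegates to it:

```python
from dataclasses import dataclass


@dataclass
class DataFrame:
    """Stand-in DataFrame: maps column names to lists of values."""

    data: dict

    def select(self, *columns: str) -> "DataFrame":
        # Keep only the requested columns, in the requested order.
        return DataFrame({name: self.data[name] for name in columns})


class Table:
    """Stand-in table backed by an in-memory dict."""

    def __init__(self, data: dict):
        self._data = data

    def read(self) -> DataFrame:
        # Takes no arguments: always returns the whole table.
        return DataFrame(dict(self._data))

    def select(self, *columns: str) -> DataFrame:
        # Convenience shortcut; equivalent to read().select(...).
        return self.read().select(*columns)
```

With this shape, `table.select("a")` and `table.read().select("a")` are interchangeable, so the shortcut adds no behavior of its own.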
daft/catalog/__init__.py
Outdated
try:
    from daft.unity_catalog import UnityCatalog
@staticmethod
def _try_from_iceberg(obj: object) -> Catalog | None:
I think most accurately, from_pyiceberg
True. I went for consistency with `daft.read_iceberg`, but could go either way.
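The `_try_from_*` helpers in the diff above return `Catalog | None`, which suggests a "try each optional backend" pattern. A rough sketch of that pattern follows, under stated assumptions: the generic `_try_from` helper and `from_object` are hypothetical names, and the stdlib `decimal` and `fractions` modules stand in for optional dependencies so the example runs anywhere.

```python
import importlib
from typing import Optional


class Catalog:
    """Toy wrapper; a stand-in, not Daft's Catalog class."""

    def __init__(self, inner: object):
        self.inner = inner

    @staticmethod
    def _try_from(module_name: str, class_name: str, obj: object) -> Optional["Catalog"]:
        # Return None when the optional dependency is missing or obj is another type.
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            return None
        cls = getattr(module, class_name, None)
        if isinstance(cls, type) and isinstance(obj, cls):
            return Catalog(obj)
        return None

    @staticmethod
    def from_object(obj: object) -> "Catalog":
        # Stand-in adapters; a real list would name optional catalog libraries.
        adapters = [("decimal", "Decimal"), ("fractions", "Fraction")]
        for module_name, class_name in adapters:
            catalog = Catalog._try_from(module_name, class_name, obj)
            if catalog is not None:
                return catalog
        raise ValueError(f"Unsupported catalog type: {type(obj).__name__}")
```

Returning `None` instead of raising lets the caller keep trying other backends and raise a single error only when none of them match.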
…ng APIs [2/3] (#3825)
This is PR [2/3] for integrating #3805 to Daft/main. This PR adds several implementations with basic end-to-end sanity tests. Most notably, this PR includes the bi-directional implementations for Catalogs and Tables. These enable us to use python-backed implementations in rust (for daft-connect and daft-sql) while also being able to return Catalog/Table python objects for use with the python APIs.
**Changes**
1. Adds several internal factories for attaching/detaching supported objects to the session.
2. Adds `Catalog.from_pydict` to cover basic needs and some backwards compatibility.
3. Adds `Identifier.from_str` for pyiceberg style identifiers (these are entirely optional).
4. Fleshes out the [minimal session actions](https://github.com/Eventual-Inc/Daft/compare/rchowell/catalog-1-of-3...rchowell/catalog-2-of-3?expand=1#diff-fc2305e1560f8f7d5a974f58325f4b231bc638d6bacebbc795b6f8fb924114ccR1884-R1895) for a release.
5. Adds Bindings for a single name resolution abstraction, which will come in handy for case-normalization in the near future.
6. Updates daft-connect and daft-sql to use `attach_table`.
Part [3/3] will cut over all deprecated APIs to use these new APIs and remove the unused artifacts. These APIs are enumerated in #3820
… abstractions [3/3] (#3830)
This PR swaps existing functionality to be backed by the new session, catalog, and table APIs.
**Changes**
* Adds set_namespace and current_namespace for qualified name resolution control
* Adds support for catalog-qualified and schema-qualified identifiers
* Adds the new session APIs to daft.* top-level
  * `daft.register_python_catalog -> daft.attach_catalog`
  * `daft.unregister_catalog -> daft.detach_catalog`
  * `daft.read_table -> daft.read_table` (via session)
  * `daft.register_table -> daft.attach_table`
* Ports existing rust tests for DaftMetaCatalog to the Session.
_Context_
* #3820
* #3825
This is PR [1/3] for integrating #3805 to Daft/main. Its primary purpose is to establish the stable APIs and prepare the cutover to Session for attach/detach and creating temporary tables.
Changes
Part [2/3] will integrate the session actions (from #3805) to replace deprecated MetaCatalog APIs.
deprecated -> replacement
daft.register_python_catalog -> daft.attach_catalog
daft.unregister_catalog -> daft.detach_catalog
daft.read_table -> daft.read_table (via session)
daft.register_table -> daft.attach_table