leverage ibis expression for getting readablerelations #2046

Merged
merged 27 commits into from
Dec 10, 2024
Changes from 9 commits
Commits
8c111e9
add ibis dataset in own class for now
sh-rp Nov 13, 2024
f928279
make error clearer
sh-rp Nov 13, 2024
4048b3c
fix some linting and fix broken test
sh-rp Nov 13, 2024
830af97
make most destinations work with selecting the right db and catalog, …
sh-rp Nov 19, 2024
e289ad1
add missing motherduck and sqlalchemy mappings
sh-rp Nov 19, 2024
b6850e8
casefold identifiers for ibis wrapper class
sh-rp Nov 19, 2024
34323da
re-organize existing dataset code to prepare ibis relation integration
sh-rp Nov 25, 2024
0eb6f58
integrate ibis relation into existing code
sh-rp Nov 25, 2024
c06525b
re-order tests
sh-rp Nov 25, 2024
48e4034
fall back to default dataset if table not in schema
sh-rp Nov 26, 2024
1fb17e0
make dataset type selectable
sh-rp Nov 26, 2024
f19a98d
add dataset type selection test and fix bug in tests
sh-rp Nov 26, 2024
acd329f
update docs for ibis expressions use
sh-rp Nov 26, 2024
afc2f06
ensure a bunch of ibis operations continue working
sh-rp Dec 5, 2024
1375fb1
add some more tests and typings
sh-rp Dec 6, 2024
d92f2f1
fix typing (with brute force get_attr typing..)
sh-rp Dec 6, 2024
f22b1dc
move ibis to dependency group
sh-rp Dec 6, 2024
7af8870
move ibis stuff to helpers
sh-rp Dec 6, 2024
3bbad35
Merge branch 'devel' into exp/ibis_expressions
sh-rp Dec 6, 2024
e50b5b5
post devel merge, put in change from dataset, update lockfile
sh-rp Dec 6, 2024
1ed3ff3
add ibis to sqlalchemy tests
sh-rp Dec 6, 2024
bedfb05
improve docs a bit
sh-rp Dec 6, 2024
ffb0ce1
fix ibis dep group
sh-rp Dec 6, 2024
71dca18
fix dataset snippets
sh-rp Dec 6, 2024
3886638
fix ibis version
sh-rp Dec 6, 2024
289e289
add support for column schema in certain query cases
sh-rp Dec 6, 2024
94fd4aa
Merge branch 'devel' into exp/ibis_expressions
rudolfix Dec 10, 2024
58 changes: 55 additions & 3 deletions dlt/common/libs/ibis.py
@@ -1,12 +1,14 @@
-from typing import cast
+from typing import cast, Any

 from dlt.common.exceptions import MissingDependencyException

 from dlt.common.destination.reference import TDestinationReferenceArg, Destination, JobClientBase
+from dlt.common.schema import Schema
+from dlt.destinations.sql_client import SqlClientBase

 try:
     import ibis  # type: ignore
-    from ibis import BaseBackend
+    import sqlglot
+    from ibis import BaseBackend, Expr
 except ModuleNotFoundError:
     raise MissingDependencyException("dlt ibis Helpers", ["ibis"])

@@ -29,6 +31,22 @@
 ]


+# Map dlt data types to ibis data types
+DATA_TYPE_MAP = {
+    "text": "string",
+    "double": "float64",
+    "bool": "boolean",
+    "timestamp": "timestamp",
+    "bigint": "int64",
+    "binary": "binary",
+    "json": "string",  # Store JSON as string in ibis
+    "decimal": "decimal",
+    "wei": "int64",  # Wei is a large integer
+    "date": "date",
+    "time": "time",
+}
+
+
 def create_ibis_backend(
     destination: TDestinationReferenceArg, client: JobClientBase
 ) -> BaseBackend:
@@ -119,3 +137,37 @@ def create_ibis_backend(
         con = ibis.duckdb.from_connection(duck)

     return con
+
+
+def create_unbound_ibis_table(
+    sql_client: SqlClientBase[Any], schema: Schema, table_name: str

Collaborator

Should we move the ibis module to helpers? You are importing modules that are not in common, so do we really need to refer to ibis in common?

Collaborator Author

I did that (see the commit), but there is one place in common where the IbisBackend typing is imported for the readable relation.

+) -> Expr:
+    """Create an unbound ibis table from a dlt schema"""
+
+    if table_name not in schema.tables:
+        raise Exception(
+            f"Table {table_name} not found in schema. Available tables: {schema.tables.keys()}"
+        )
+    table_schema = schema.tables[table_name]
+
+    # Convert dlt table schema columns to ibis schema
+    ibis_schema = {
+        sql_client.capabilities.casefold_identifier(col_name): DATA_TYPE_MAP[
+            col_info.get("data_type", "string")
+        ]
+        for col_name, col_info in table_schema.get("columns", {}).items()
+    }
+
+    # normalize table name
+    table_path = sql_client.make_qualified_table_name_path(table_name, escape=False)
+
+    catalog = None
+    if len(table_path) == 3:
+        catalog, database, table = table_path
+    else:
+        database, table = table_path
+
+    # create unbound ibis table and return in dlt wrapper
+    unbound_table = ibis.table(schema=ibis_schema, name=table, database=database, catalog=catalog)
+
+    return unbound_table
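
The two pure-Python steps inside create_unbound_ibis_table (mapping dlt column types to ibis type names, and splitting the qualified table path into catalog, database and table) can be tried without ibis or a destination installed. The sketch below is illustrative only: to_ibis_schema and split_table_path are hypothetical helper names that do not appear in the PR, str.lower stands in for sql_client.capabilities.casefold_identifier, and the fallback for a column without a data_type is assumed to be "text". The diff passes "string" as the default, which is not itself a key of DATA_TYPE_MAP, so that default would raise KeyError in the lookup as written.

```python
from typing import Any, Callable, Dict, Mapping, Optional, Sequence, Tuple

# Copied from the diff above: dlt data type -> ibis data type name.
DATA_TYPE_MAP: Dict[str, str] = {
    "text": "string",
    "double": "float64",
    "bool": "boolean",
    "timestamp": "timestamp",
    "bigint": "int64",
    "binary": "binary",
    "json": "string",  # JSON is stored as a string in ibis
    "decimal": "decimal",
    "wei": "int64",
    "date": "date",
    "time": "time",
}


def to_ibis_schema(
    columns: Mapping[str, Mapping[str, Any]],
    casefold: Callable[[str], str] = str.lower,  # assumption: stand-in for casefold_identifier
) -> Dict[str, str]:
    """Hypothetical helper: build an ibis-style schema dict from dlt column definitions."""
    return {
        # assumption: fall back to "text" (a valid map key) when data_type is missing
        casefold(name): DATA_TYPE_MAP[info.get("data_type", "text")]
        for name, info in columns.items()
    }


def split_table_path(path: Sequence[str]) -> Tuple[Optional[str], str, str]:
    """Hypothetical helper: split a 2- or 3-part path into (catalog, database, table)."""
    if len(path) == 3:
        catalog, database, table = path
        return catalog, database, table
    database, table = path
    return None, database, table


if __name__ == "__main__":
    columns = {"ID": {"data_type": "bigint"}, "Payload": {"data_type": "json"}}
    print(to_ibis_schema(columns))
    print(split_table_path(["my_catalog", "my_dataset", "items"]))
    print(split_table_path(["my_dataset", "items"]))
```

The resulting dict and the (catalog, database, table) triple correspond to the schema, catalog, database and name arguments that the diff passes to ibis.table.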