updates duckdb/motherduck load job, adds full ci for motherduck and updates docs #1674
@@ -1,11 +1,10 @@
 ---
-title: 🧪 MotherDuck
+title: MotherDuck
👍
        qualified_table_name, threading.Lock()
    )
    source_format = "read_parquet"
    options = ", union_by_name=true"
elif self._file_path.endswith("jsonl"):
    # NOTE: loading JSON does not work in practice on duckdb: the missing keys fail the load instead of being interpreted as NULL
Judging from the work on my datasets PR, this is not true, or is no longer true. I have a test there that migrates a table, and it still works in duckdb with both json and parquet.
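For readers following the thread, the behaviour being discussed (a key missing from one JSON record is read as NULL instead of failing the load) can be sketched in plain Python. This is only an illustration under stated assumptions: `read_jsonl_rows` and the sample records are hypothetical, and the sketch does not call duckdb or reproduce its `read_json` implementation.

```python
import json
from typing import Any, Dict, List, Optional, Sequence


def read_jsonl_rows(
    lines: Sequence[str], columns: Sequence[str]
) -> List[Dict[str, Optional[Any]]]:
    """Hypothetical helper: parse JSONL lines into rows with a fixed column set.

    Keys missing from a record become None (the NULL equivalent here)
    instead of raising an error.
    """
    rows: List[Dict[str, Optional[Any]]] = []
    for line in lines:
        line = line.strip()
        if not line:
            # skip blank lines between records
            continue
        record = json.loads(line)
        # dict.get returns None for keys the record does not contain
        rows.append({col: record.get(col) for col in columns})
    return rows


# Hypothetical sample data: the second record has no "time" key.
jsonl_lines = ['{"id": 1, "time": "2024-01-01"}', '{"id": 2}']
rows = read_jsonl_rows(jsonl_lines, ["id", "time"])
```

Whether duckdb itself behaves this way for a given version is exactly what the test mentioned in the comment would confirm; the sketch only shows the intended semantics.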
# we will use a different pipeline with a separate schema but writing to the same dataset and to the same table
# the table schema is identical to the previous one with a single field ("time") added
# this will create a different order of columns than in the destination database ("time" will map to "_dlt_id")
# duckdb copies columns by column index so that will fail
I don't understand the last line. AFAIK you are testing union_by_name here, and this should NOT fail.
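The distinction at issue, copying columns by index versus matching them by name, can be sketched without a database. The code below is a hedged illustration only: both helper functions, the column orders, and the sample row are hypothetical, chosen so that a positional copy puts the "time" value into "_dlt_id" as the test comment describes. It does not call duckdb.

```python
from typing import Any, Dict, Optional, Sequence


def map_by_position(
    dest_columns: Sequence[str], row: Sequence[Any]
) -> Dict[str, Any]:
    """Hypothetical helper: assign values to destination columns by index."""
    return dict(zip(dest_columns, row))


def map_by_name(
    dest_columns: Sequence[str],
    file_columns: Sequence[str],
    row: Sequence[Any],
) -> Dict[str, Optional[Any]]:
    """Hypothetical helper: assign values by column name; absent columns become None."""
    by_name = dict(zip(file_columns, row))
    return {col: by_name.get(col) for col in dest_columns}


# Assumed orders: in the destination "time" was appended last,
# while the new file has it before "_dlt_id".
dest_columns = ["id", "_dlt_load_id", "_dlt_id", "time"]
file_columns = ["id", "_dlt_load_id", "time", "_dlt_id"]
row = [1, "load-1", "2024-01-01T00:00:00Z", "abc"]

positional = map_by_position(dest_columns, row)
named = map_by_name(dest_columns, file_columns, row)
```

In this sketch `positional` stores the timestamp under "_dlt_id", while `named` keeps every value under its own column, which is the outcome a by-name load is expected to give.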
LGTM
Description
read_json without using COPY FROM, which allows skipping of the fields