add fast loading methods for bigquery destination #1037

Closed
rudolfix opened this issue Feb 29, 2024 · 0 comments
Labels
community This issue came from slack community workspace

Comments

@rudolfix
Collaborator

Background
The current method, which uses load jobs, is optimized for large loads. However, most load packages are quite small. We can improve loading speed by implementing streaming insert jobs. There's a Singer target, https://github.com/z3z1ma/target-bigquery by @z3z1ma, from which we can take code.

We'll implement an unconstrained (schema-less) version of the above when #891 is merged. This issue is about extending the existing destination.

Tasks

    • Allow the user to select the loading API via destination configuration, and per resource/table via the bigquery adapter.
    • Implement the insert API and, optionally, the Storage Write API.
    • Allow both parquet and jsonl files to be loaded this way. Consider adding standard file readers for those types (port them from the data sink destination, #891).
    • Add tests and documentation.
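
A rough sketch of how the first three tasks could fit together for jsonl files, not a proposal for the final design. It assumes the streaming path goes through `insert_rows_json` on the `google-cloud-bigquery` client, which takes a table and a list of JSON-compatible rows and returns a list of per-row errors. The names `resolve_insert_api`, `iter_batches`, `stream_jsonl`, the `"default"`/`"streaming"` values, and the batch size of 500 are all hypothetical and chosen only for illustration.

```python
import json
from typing import Any, Dict, Iterable, Iterator, List, Optional, Protocol, Sequence

Row = Dict[str, Any]


class StreamingClient(Protocol):
    """Anything shaped like google.cloud.bigquery.Client for streaming inserts."""

    def insert_rows_json(self, table: str, json_rows: Sequence[Row]) -> Sequence[Any]:
        ...


def resolve_insert_api(destination_default: str, table_override: Optional[str] = None) -> str:
    """Pick the loading API: a per-table setting wins over the destination default.

    The option values are assumptions, not an existing configuration schema.
    """
    choice = table_override or destination_default
    if choice not in ("default", "streaming"):
        raise ValueError(f"unknown insert api: {choice!r}")
    return choice


def iter_batches(rows: Iterable[Row], batch_size: int) -> Iterator[List[Row]]:
    """Group rows into lists of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be positive")
    batch: List[Row] = []
    for row in rows:
        batch.append(row)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch


def stream_jsonl(
    client: StreamingClient, table_id: str, lines: Iterable[str], batch_size: int = 500
) -> int:
    """Parse jsonl lines and send them in batches; return the number of rows sent."""
    rows = (json.loads(line) for line in lines if line.strip())
    sent = 0
    for batch in iter_batches(rows, batch_size):
        # insert_rows_json returns an empty sequence when every row was accepted
        errors = client.insert_rows_json(table_id, batch)
        if errors:
            raise RuntimeError(f"streaming insert into {table_id} failed: {errors}")
        sent += len(batch)
    return sent
```

With a real client this would be called roughly as `stream_jsonl(bigquery.Client(), "project.dataset.table", open(path))` whenever `resolve_insert_api(...)` returns `"streaming"`, falling back to the existing load job otherwise. A parquet reader would only need to yield the same row dictionaries to reuse `iter_batches`.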
@rudolfix rudolfix added the community This issue came from slack community workspace label Mar 1, 2024
@rudolfix rudolfix moved this from Todo to In Progress in dlt core library Mar 8, 2024
@rudolfix rudolfix closed this as completed Apr 8, 2024
@github-project-automation github-project-automation bot moved this from In Progress to Done in dlt core library Apr 8, 2024
Projects
Status: Done