feat: add upsert data lake support #86

Closed
wants to merge 1 commit into from

Contributor

@mikix mikix commented Nov 22, 2022

WIP

Description

Write to a Delta Lake location by using --output-format=delta. This is now also the default output format instead of parquet, though note that the data files inside a Delta Lake are still stored in Parquet format.

This commit adds a new dependency on 'delta-spark' and uses Apache Spark to manage the delta lake.
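
For reviewers unfamiliar with delta-spark, the kind of upsert this PR is aiming for could look roughly like the sketch below. It is not the code from this commit: the `upsert` and `merge_condition` helpers, the `target`/`source` aliases, and the `id` key column are hypothetical names chosen for illustration, and it assumes a Spark session that is already configured with the Delta Lake extensions.

```python
def merge_condition(key_columns, target="target", source="source"):
    """Build a SQL condition that matches rows on every key column."""
    return " AND ".join(
        f"{target}.{col} = {source}.{col}" for col in key_columns
    )


def upsert(spark, path, updates, key_columns=("id",)):
    """Merge `updates` (a Spark DataFrame) into the Delta table at `path`.

    Hypothetical helper: `key_columns` defaults to an assumed `id` column.
    """
    # Imported lazily so this module still loads where delta-spark is absent.
    from delta.tables import DeltaTable

    if not DeltaTable.isDeltaTable(spark, path):
        # First run: there is no table yet, so create it from the new rows.
        updates.write.format("delta").save(path)
        return

    (
        DeltaTable.forPath(spark, path)
        .alias("target")
        .merge(updates.alias("source"), merge_condition(key_columns))
        .whenMatchedUpdateAll()      # existing keys: overwrite the row
        .whenNotMatchedInsertAll()   # new keys: append the row
        .execute()
    )
```

Rows whose keys already exist are updated in place and all other rows are inserted, so re-running on overlapping data should not create duplicates.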

Fixes #75

Checklist

  • Consider if documentation (like in docs/) needs to be updated
  • Consider if tests should be added

@mikix mikix force-pushed the mikix/deltalake branch 2 times, most recently from de69537 to 9362c82 on November 28, 2022 14:11
@mikix mikix changed the title feat: add delta lake support feat: add upsert data lake support Nov 28, 2022
Write to an ACID upsert data lake location by using
--output-format=delta (which is now also the default instead of
parquet, though note that the files are still stored in parquet
format inside a data lake).
@mikix mikix closed this Dec 5, 2022
@mikix mikix deleted the mikix/deltalake branch December 5, 2022 14:39
Contributor Author

mikix commented Dec 5, 2022

Moved to #89

Development

Successfully merging this pull request may close these issues.

Support run-to-run deltas