overwrite filter strategy #1343

Open
wants to merge 12 commits into
base: main

stankiewicz
Contributor

Goal: bring more flexibility and more strategies to incremental models. updatePartitionFilter alone is not enough.
This PR adds an insert-overwrite strategy that runs a DML DELETE before model execution. The DELETE statement should select the partitions to remove either dynamically or from static input.

Why it's needed:
Customers compute expensive aggregates. Sometimes neither the input nor the output has unique keys, so the incremental model is append-only, and relying on a pre-statement to clean up is error-prone.

Solution suggested:
Adapter for incremental tables should support:

  • Append only (especially if no keys are provided)
  • Merge statement if keys are provided
  • Insert overwrite via DELETE FROM followed by INSERT
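
A minimal sketch of how the adapter could pick between the three strategies above. The `IncrementalConfig` shape, the `insertOverwrite` flag and the `chooseStrategy` function are hypothetical names used for illustration only; they are not Dataform's actual API, and the PR does not state how insert overwrite is selected.

```typescript
// Hypothetical config shape; field names are illustrative, not Dataform's API.
interface IncrementalConfig {
  uniqueKey?: string[];      // keys used for MERGE, if any
  insertOverwrite?: boolean; // assumed opt-in flag for the new strategy
}

type Strategy = "append" | "merge" | "insert_overwrite";

// Pick a strategy: explicit insert overwrite first, then MERGE when keys
// are provided, otherwise fall back to append-only.
function chooseStrategy(config: IncrementalConfig): Strategy {
  if (config.insertOverwrite) {
    return "insert_overwrite";
  }
  if (config.uniqueKey && config.uniqueKey.length > 0) {
    return "merge";
  }
  return "append";
}
```

Giving the explicit flag precedence over `uniqueKey` is an assumption; the opposite order would also be defensible.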

Partitioning should not be enforced: some SCD tables, such as those in a Data Vault model, may be clustered only.

The insert-overwrite strategy allows setting an overwrite_filter:

  • Default (empty): when partition_by is set, a DML DELETE FROM is issued based on the columns used in partition_by; otherwise the run fails
  • Custom, for example overwrite_filter = "current_date()" or overwrite_filter = ${dataform.projectConfig.vars.date}
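
A rough sketch of how the DELETE statement could be derived from these two modes. The exact SQL is not specified in this description, so everything here is an assumption: `OverwriteConfig` and `buildDeleteStatement` are hypothetical names, the default mode is read as "delete every partition that the incremental query produces", and a custom overwrite_filter is read as a value expression compared against the partition_by expression.

```typescript
// Hypothetical config shape, for illustration only.
interface OverwriteConfig {
  target: string;           // fully qualified target table
  partitionBy?: string;     // partition_by expression, if any
  overwriteFilter?: string; // custom filter expression; empty means default
}

// Build the DELETE that runs before the INSERT of the incremental query.
function buildDeleteStatement(config: OverwriteConfig, incrementalQuery: string): string {
  const { target, partitionBy, overwriteFilter } = config;
  const hasCustomFilter = overwriteFilter !== undefined && overwriteFilter.trim() !== "";

  if (!partitionBy) {
    // The default mode fails without partition_by (as described above).
    // Custom filters without partition_by are not covered by this sketch.
    throw new Error(
      hasCustomFilter
        ? "this sketch only handles a custom overwrite_filter together with partition_by"
        : "insert overwrite with an empty overwrite_filter requires partition_by"
    );
  }

  if (hasCustomFilter) {
    // Static input: delete only the partition matching the given expression.
    return `DELETE FROM ${target} WHERE ${partitionBy} = ${overwriteFilter}`;
  }

  // Dynamic selection: delete the partitions the new rows will land in.
  return `DELETE FROM ${target} WHERE ${partitionBy} IN (SELECT DISTINCT ${partitionBy} FROM (${incrementalQuery}))`;
}
```

For example, `buildDeleteStatement({ target: "ds.t", partitionBy: "event_date", overwriteFilter: "current_date()" }, "SELECT 1")` yields a DELETE restricted to today's partition, while leaving overwriteFilter empty derives the partitions from the incremental query itself.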

@lewish lewish changed the base branch from master to main July 20, 2022 15:36