feat(incremental): optimize 'insert_overwrite' strategy (#1409) #1410
+44
−0
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
resolves #1409
docs
"N/A"
Problem
The
MERGE
statement is sub-optimized in BigQuery when it comes to only replace partitions in the'insert_overwrite'
strategy forincremental
modelsSolution
For the
insert_overwrite
strategy where we are looking to replace rows at the partition-level, there is a better solution and here is why:DELETE
orINSERT
statement is cheapest than aMERGE
statement.DELETE
statement in BigQuery is free at the partition-level.MERGE
statement it reduces the cost by 50.4% and the elapsed time by 35.2% (slot based and not on demand)Checklist
'insert_overwrite'
)