Clone modified incremental models as step #1 of dbt Cloud CI job (Snowflake only) #1477
Closed
graciegoheen
announced in
Archive
Replies: 1 comment 2 replies
-
FOLLOW UP: While this does successfully clone the modified incremental models, currently the [...] This is likely not possible until [...]
-
CAUTION: This solution will not work, because cloning must be done via a `run-operation` for dbt to be aware of the cloned objects prior to parsing. Until we're able to access `selected_resources` via the `run-operation`, cloning only the modified incremental models in a CI job is not possible.
Background:
Imagine that you've created a Slim CI job in dbt Cloud.
Your CI job:
dbt build --select state:modified+
Now imagine your dbt project looks like this:
When you open a PR that modifies `model_1`, your CI job will kick off and build only the modified models and their downstream dependencies (in this case: `model_1`, `model_2`, and `model_3`) into a PR-specific schema. This mimics the behavior of what will happen once the PR is merged into the main branch (so you have confidence that you're not introducing breaking changes), without requiring a build of your entire dbt project.
But what happens when one of the modified models (or one of their downstream dependencies) is an incremental model?
Because your CI job is building modified models into a PR-specific schema, on the first execution of `dbt build --select state:modified+` the modified incremental model will be built in its entirety, because it does not yet exist in the PR-specific schema (that is, `is_incremental` will be false).
This can cause problems because:
- a `full-refresh` of the incremental model can pass successfully in your CI job while an incremental build of that same table in prod would fail when the PR is merged into main (think schema drift where the `on_schema_change` config is set to `fail`)
We can alleviate the above problems by zero-copy cloning these incremental models into our PR-specific schema as the first step of the CI job. This way, the incremental models already exist in the PR-specific schema when you first execute the command `dbt build --select state:modified+`, so the `is_incremental` flag will be `true`.
Your CI jobs will run faster, and you're more accurately mimicking the behavior of "what will happen once the PR has been merged into main".
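To make the `is_incremental` behavior concrete, here is a minimal sketch of an incremental model. The file name, the column names, and the `unique_key` are hypothetical and not from the original post; `model_1` is the upstream model from the example above. The filter inside the `is_incremental()` block is only applied when the target table already exists in the schema being built into (and the run is not a full refresh), which is why a pre-existing clone changes what the CI job actually exercises.

```sql
-- models/model_2.sql (hypothetical incremental model)
{{ config(
    materialized='incremental',
    unique_key='id',
    on_schema_change='fail'
) }}

select
    id,
    updated_at
from {{ ref('model_1') }}

{% if is_incremental() %}
  -- Applied only when this model's table already exists in the target schema
  -- and the run is not a full refresh; skipped on a first build in a fresh PR schema.
  where updated_at > (select max(updated_at) from {{ this }})
{% endif %}
```

In a fresh PR-specific schema the `where` clause above is skipped, so the incremental code path and the `on_schema_change` check are never tested unless the table is already there.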
In the past, we've been able to accomplish a similar goal by cloning the entire production schema into the PR-specific schema following this approach. But with the introduction of the new `selected_resources` context variable in version 1.1, we can clone only the modified+ incremental models, which will decrease the execution time of our CI job.
Step 1:
Make sure you've updated your project to v1.1.
Step 2:
Ensure your target is set as `ci` for your dbt Cloud CI job.
Step 3:
Add the following `clone_modified_incrementals` macro to your dbt project:
Step 4:
Add the macro as an `on-run-start` hook to your `dbt_project.yml` file:
Disclaimers
- The macro runs as an `on-run-start` hook instead of a `run-operation` because the `selected_resources` variable is not currently accessible when using the command `run-operation`.
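For readers trying to picture Steps 3 and 4, the following is a rough sketch of what a `clone_modified_incrementals` macro could look like. It is not the original macro from this post. The arguments `prod_database` and `prod_schema` are hypothetical, and the sketch assumes that every production model lives in a single schema and that a production table already exists for each selected incremental model.

```sql
{# Hypothetical sketch, not the original macro from this post.
   prod_database / prod_schema are illustrative arguments; all prod models
   are assumed to live in that one schema. #}
{% macro clone_modified_incrementals(prod_database, prod_schema) %}
  {# Only act for the CI target, and only at execution time, when graph is populated. #}
  {% if execute and target.name == 'ci' %}
    {% for node in graph.nodes.values() %}
      {# Keep only incremental models that are part of the current selection. #}
      {% if node.resource_type == 'model'
            and node.config.materialized == 'incremental'
            and node.unique_id in selected_resources %}
        {% set clone_sql %}
          create or replace transient table {{ node.database }}.{{ node.schema }}.{{ node.alias }}
          clone {{ prod_database }}.{{ prod_schema }}.{{ node.alias }}
        {% endset %}
        {% do log("Cloning " ~ node.alias ~ " from " ~ prod_schema, info=true) %}
        {% do run_query(clone_sql) %}
      {% endif %}
    {% endfor %}
  {% endif %}
{% endmacro %}
```

Under the same assumptions, the hook entry in `dbt_project.yml` would be a single item under `on-run-start`, such as `"{{ clone_modified_incrementals('analytics', 'prod') }}"`, where `analytics` and `prod` are placeholder names. A brand-new incremental model has no production table to clone, so a real macro would need to skip that case. As the caution at the top of this post notes, dbt is not aware of objects cloned this way, so treat the sketch as an illustration of the idea rather than a working CI setup.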