-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(dbt): load dbt scaffold using package_data
#15502
Conversation
Current dependencies on/for this PR: This comment was auto-generated by Graphite. |
dbbb5c1
to
1f508a6
Compare
1f508a6
to
509c470
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can a dbt scaffold work without including it in package data?
if not, this bool shouldn't be an option, we should always expect a subdir containing the dbt project
{% if use_dbt_project_package_data_dir -%} | ||
package_data={ | ||
"{{ project_name }}": [ | ||
"dbt-project/*", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fyi not sure if you expected nested directories here but this might only do one level by default.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did expect nested directories here, but looks like /*
worked as I expected, since it built in cloud. But I'll use a recursive glob here, since that seems to be supported as well, and makes this behavior explicit: https://setuptools.pypa.io/en/latest/history.html#v62-3-0
@@ -89,6 +90,9 @@ def copy_scaffold( | |||
for target in profile["outputs"].values() | |||
] | |||
|
|||
if use_dbt_project_package_data_dir: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should just use the passed in dbt_project_dir and just error if the dbt_project_dir is not under the start path
The scaffold can run locally without package data. However, it is required when you deploy. I was planning on adding documentation to explain the changes required to deploy a Dagster + dbt project in a Python package to cloud. But to shield the user from this complexity, by default we don't enable this in the open source CLI. |
509c470
to
6c3d6b2
Compare
## Summary & Motivation Allow `dagster-dbt project scaffold` to load dbt projects from their path in `package_data`. This will be invoked in the dbt NUX. We hardcode `dbt-project` as the dbt project directory name for the `package_data`. This restraint could be loosened in the future. ## How I Tested These Changes - pytest - dbt NUX creates a pull request that builds to branch deployment: https://github.com/dagster-io/test-dagster-dbt-scaffold/pull/53 - Github actions that build a Dagster package utilizing `package_data` dagster-io/dagster-cloud-action#147
Summary & Motivation
Allow
dagster-dbt project scaffold
to load dbt projects from their path inpackage_data
. This will be invoked in the dbt NUX.We hardcode
dbt-project
as the dbt project directory name for thepackage_data
. This restraint could be loosened in the future.How I Tested These Changes
package_data
chore: add steps to build Python package with dbt project dagster-cloud-action#147