Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(dbt): load dbt scaffold using package_data #15502

Merged
merged 1 commit into from
Jul 27, 2023

Conversation

rexledesma
Copy link
Contributor

@rexledesma rexledesma commented Jul 25, 2023

Summary & Motivation

Allow dagster-dbt project scaffold to load dbt projects from their path in package_data. This will be invoked in the dbt NUX.

We hardcode dbt-project as the dbt project directory name for the package_data. This restraint could be loosened in the future.

How I Tested These Changes

@rexledesma
Copy link
Contributor Author

Current dependencies on/for this PR:

This comment was auto-generated by Graphite.

@rexledesma rexledesma force-pushed the rl/install-dbt-project-as-part-of-package branch 2 times, most recently from dbbb5c1 to 1f508a6 Compare July 26, 2023 00:23
@rexledesma rexledesma requested a review from sryza July 26, 2023 00:23
@rexledesma rexledesma self-assigned this Jul 26, 2023
@rexledesma rexledesma requested a review from shalabhc July 26, 2023 00:23
@rexledesma rexledesma marked this pull request as ready for review July 26, 2023 00:23
@rexledesma rexledesma force-pushed the rl/install-dbt-project-as-part-of-package branch from 1f508a6 to 509c470 Compare July 26, 2023 06:12
Copy link
Contributor

@shalabhc shalabhc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can a dbt scaffold work without including it in package data?
if not, this bool shouldn't be an option, we should always expect a subdir containing the dbt project

{% if use_dbt_project_package_data_dir -%}
package_data={
"{{ project_name }}": [
"dbt-project/*",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fyi not sure if you expected nested directories here but this might only do one level by default.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did expect nested directories here, but looks like /* worked as I expected, since it built in cloud. But I'll use a recursive glob here, since that seems to be supported as well, and makes this behavior explicit: https://setuptools.pypa.io/en/latest/history.html#v62-3-0

@@ -89,6 +90,9 @@ def copy_scaffold(
for target in profile["outputs"].values()
]

if use_dbt_project_package_data_dir:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should just use the passed in dbt_project_dir and just error if the dbt_project_dir is not under the start path

@rexledesma
Copy link
Contributor Author

can a dbt scaffold work without including it in package data? if not, this bool shouldn't be an option, we should always expect a subdir containing the dbt project

The scaffold can run locally without package data. However, it is required when you deploy. I was planning on adding documentation to explain the changes required to deploy a Dagster + dbt project in a Python package to cloud. But to shield the user from this complexity, by default we don't enable this in the open source CLI.

@rexledesma rexledesma force-pushed the rl/install-dbt-project-as-part-of-package branch from 509c470 to 6c3d6b2 Compare July 27, 2023 00:35
@rexledesma rexledesma merged commit 73b47a5 into master Jul 27, 2023
@rexledesma rexledesma deleted the rl/install-dbt-project-as-part-of-package branch July 27, 2023 00:44
benpankow pushed a commit that referenced this pull request Jul 27, 2023
## Summary & Motivation
Allow `dagster-dbt project scaffold` to load dbt projects from their
path in `package_data`. This will be invoked in the dbt NUX.

We hardcode `dbt-project` as the dbt project directory name for the
`package_data`. This restraint could be loosened in the future.

## How I Tested These Changes
- pytest
- dbt NUX creates a pull request that builds to branch deployment:
https://github.com/dagster-io/test-dagster-dbt-scaffold/pull/53
- Github actions that build a Dagster package utilizing `package_data`
dagster-io/dagster-cloud-action#147
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants