Consolidate ferc1 outputs using Dagster asset factories #3147

Closed
zaneselvans opened this issue Dec 11, 2023 · 2 comments · Fixed by #3883
Labels
community
dagster: Issues related to our use of the Dagster orchestrator
ferc1: Anything having to do with FERC Form 1
good-first-issue: Good issues for first-time contributors. Self-contained, low context, no credentials required.

Comments

@zaneselvans
Member

zaneselvans commented Dec 11, 2023

The top portion of the pudl.output.ferc1 module contains a number of individual asset definitions for denormalized / output tables with very similar structures, which could be consolidated into a small number of asset factories using the pattern adopted in e.g. pudl.extract.ferc714 (after PR #3123). See Dagster's blog post Factory Patterns in Python for some more background on the factory design pattern, and its application to Dagster assets.
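As a rough illustration of the factory idea, here is a minimal plain-Python sketch. It deliberately avoids importing Dagster so it runs anywhere; the table names, the `utility_id` column, and the `denorm_factory` helper are all hypothetical and are not taken from `pudl.output.ferc1`. In a real Dagster version the inner function would presumably be wrapped with `@asset(name=table_name, ...)`.

```python
from typing import Callable

Row = dict[str, object]
Table = list[Row]


def denorm_factory(table_name: str, id_col: str) -> Callable[[Table, Table], Table]:
    """Build one denormalizing function. All names here are hypothetical."""

    def _denormalize(core: Table, utilities: Table) -> Table:
        # Index the lookup table by its ID column, then left-join it onto
        # each core row. Core values win if a column appears in both.
        lookup = {u[id_col]: u for u in utilities}
        return [{**lookup.get(row[id_col], {}), **row} for row in core]

    # In Dagster, this is roughly where the function would be decorated
    # as an asset so that each factory call yields a distinct asset.
    _denormalize.__name__ = table_name
    return _denormalize


# One entry per output table replaces one hand-written definition per table.
OUTPUT_TABLES = ["out_example_steam", "out_example_hydro"]  # hypothetical names
builders = {name: denorm_factory(name, "utility_id") for name in OUTPUT_TABLES}
```

The point of the design is that the shared merge logic lives in one place, and adding another output table means adding one entry to the list rather than another near-identical function.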

Note that the calls to pudl.helpers.organize_cols() found in the current FERC 1 output asset definitions are no longer required, because the ordering of columns in the database is now determined by the resource definitions / database schema. These calls are left over from when we were producing dataframes for users on request rather than writing these tables to the database.

Note that some of these assets currently create new columns containing derived values, and those would need to be preserved, either with their own asset definitions, or some way of keeping track of which calculations should be done for what tables inside the asset factory.
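One possible way to keep track of which calculations belong to which tables is a mapping from table name to an optional derivation function that the factory looks up. The sketch below is plain Python with hypothetical names throughout (`DERIVED`, `add_capacity_factor`, `output_factory`, and the column names are assumptions for illustration, not the existing PUDL code).

```python
from typing import Callable, Optional

Row = dict[str, float]
Deriver = Callable[[Row], Row]


def add_capacity_factor(row: Row) -> Row:
    """Hypothetical derived column: net generation over maximum possible output."""
    hours = 8760  # hours in a non-leap year
    factor = row["net_generation_mwh"] / (row["capacity_mw"] * hours)
    return {**row, "capacity_factor": factor}


# Tables missing from this mapping get no derived columns.
DERIVED: dict[str, Deriver] = {"out_example_plants": add_capacity_factor}


def output_factory(table_name: str) -> Callable[[list[Row]], list[Row]]:
    """Build an output function, applying a table-specific derivation if one exists."""
    derive: Optional[Deriver] = DERIVED.get(table_name)

    def _build(rows: list[Row]) -> list[Row]:
        # Copy each row so the input table is never mutated.
        return [derive(r) if derive else dict(r) for r in rows]

    _build.__name__ = table_name
    return _build
```

Tables whose derivations are too involved for a simple per-row hook could still keep their own standalone asset definitions, as the issue suggests.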

zaneselvans added the ferc1, dagster, and good-first-issue labels Dec 11, 2023
zaneselvans moved this from New to Backlog in Catalyst Megaproject Dec 11, 2023
@hfireborn
Contributor

@catalyst-cooperative/com-dev Is this still open? I'd like to work on this as a first-time contributor.

@zaneselvans
Member Author

Hey there! Yes, this is still open. I was thinking about this as a good one after getting your office hours signup. There are lots of other examples of asset factories floating around that you could use as a guide. If you have a chance to get the PUDL / Dagster local development environment running, this should be a pretty easy thing to test out locally.

hfireborn added a commit to hfireborn/pudl that referenced this issue Apr 11, 2024
zaneselvans linked a pull request Apr 12, 2024 that will close this issue
zaneselvans moved this from Backlog to In progress in Catalyst Megaproject Apr 12, 2024
e-belfer moved this from In progress to In review in Catalyst Megaproject Sep 30, 2024
github-merge-queue bot pushed a commit that referenced this issue Oct 1, 2024
* #3147 draft1

* [pre-commit.ci] auto fixes from pre-commit.com hooks

For more information, see https://pre-commit.ci

* cleaned up with linters and ran from container for precommit hooks

* [pre-commit.ci] auto fixes from pre-commit.com hooks

For more information, see https://pre-commit.ci

* updated draft of asset factory solution

* Update asset factory to include all standard output tables

* Update docstrings

* Update release notes

---------

Co-authored-by: hfeuer <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: hfireborn <[email protected]>
Co-authored-by: Zane Selvans <[email protected]>
@github-project-automation github-project-automation bot moved this from In review to Done in Catalyst Megaproject Oct 1, 2024
Projects
Archived in project
2 participants