Rename MCOE and plant part list assets #2904
Codecov Report
Coverage diff, rename-core-assets vs. #2904:
Coverage: 88.5% -> 88.5%
Files: 90 -> 90
Lines: 10808 -> 10819 (+11)
Hits: 9570 -> 9580 (+10)
Misses: 1238 -> 1239 (+1)
src/pudl/analysis/mcoe.py
Outdated
compute_kind="Python",
io_manager_key="pudl_sqlite_io_manager",
description=f"{agg_freqs[freq].title()} heat rate estimates by generation unit. Generation "
I added the descriptions to the assets because their metadata will be removed from pudl.metadata.resources when we deprecate PudlTabl.
Is this true for other assets created through asset factories, or is there something special about this set?
This is true for other intermediate assets that are currently written to the database because we provide access to them via PudlTabl. These assets will use the default IO manager when we deprecate PudlTabl.
There are a handful of other output intermediate assets that are being written to the database that don't have descriptions specified in the asset decorators. I can remove these descriptions for now and handle it when we deprecate PudlTabl.
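To make the pattern in this thread concrete, here is a minimal sketch of an asset factory that attaches a description to each generated asset. It is plain Python rather than Dagster. The `AssetSpec` dataclass, the `heat_rate_asset_specs` helper, the asset names, and the keys of `AGG_FREQS` are all assumptions made for illustration; only the first sentence of the description template and the `pudl_sqlite_io_manager` key come from the diff above.

```python
from dataclasses import dataclass

# Assumed mapping from a frequency code to a human-readable label.
# The real keys and values used in PUDL may differ.
AGG_FREQS = {"AS": "yearly", "MS": "monthly"}


@dataclass(frozen=True)
class AssetSpec:
    """Hypothetical stand-in for the arguments passed to an asset decorator."""

    name: str
    description: str
    io_manager_key: str
    compute_kind: str = "Python"


def heat_rate_asset_specs(agg_freqs: dict[str, str]) -> list[AssetSpec]:
    """Build one spec per aggregation frequency, each with its own description."""
    return [
        AssetSpec(
            name=f"heat_rate_by_unit_{label}",  # hypothetical asset name
            description=f"{label.title()} heat rate estimates by generation unit.",
            io_manager_key="pudl_sqlite_io_manager",
        )
        for label in agg_freqs.values()
    ]


specs = heat_rate_asset_specs(AGG_FREQS)
```

Keeping the description next to the factory means it would travel with the asset even if the corresponding entry in `pudl.metadata.resources` were removed later, which is the motivation given in the comment above.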
if exclude_intermediate_resources:
    [
        resource
        for resource in self.resources
        if not resource.name.startswith("_")
    ]
Added this if we want to exclude intermediate assets from datasette. I think we should include them for now so the database, data dictionary and datasette are all consistent.
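As a rough illustration of the filtering described in this comment, the sketch below drops resources whose names start with an underscore and then sorts what remains. The `Resource` dataclass, the `sorted_resources` function, the example table names, and the alphabetical ordering that ignores leading underscores are assumptions; the actual method on `Package` may order tables differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    """Hypothetical stand-in for a metadata resource; only the name matters here."""

    name: str


def sorted_resources(
    resources: list[Resource], exclude_intermediate_resources: bool = False
) -> list[Resource]:
    """Optionally drop intermediate (underscore-prefixed) resources, then sort."""
    if exclude_intermediate_resources:
        resources = [r for r in resources if not r.name.startswith("_")]
    # Assumed ordering: alphabetical by name, ignoring any leading underscores.
    return sorted(resources, key=lambda r: r.name.lstrip("_"))


example = [
    Resource("out_eia__yearly_plants"),
    Resource("_out_eia__yearly_generators"),
    Resource("out_eia__monthly_generators"),
]
all_names = [r.name for r in sorted_resources(example)]
public_names = [
    r.name for r in sorted_resources(example, exclude_intermediate_resources=True)
]
```

With the flag left at its default, intermediate tables stay in the listing, which matches the stated preference to keep the database, data dictionary, and datasette consistent for now.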
@@ -183,12 +183,10 @@
for freq in AGG_FREQS
}
| {
- f"mcoe_generators_{freq}": {
+ f"out_eia__{freq}_generators": {
We decided to add all generator attributes to this table so it's a one stop shop for users.
@@ -114,7 +114,7 @@ def out_eia__yearly_plants(
},
compute_kind="Python",
)
- def out_eia__yearly_generators(
+ def _out_eia__yearly_generators(
I made this an intermediate table because the old mcoe_generators_{freq} table (now out_eia__{freq}_generators) has all of the same attributes plus the valuable derived attributes.
@ella the migrations changed on
One blocking question re the plant_parts asset name, otherwise just some comments and questions for clarification. Fast ETL worked perfectly for me out of the box.
src/pudl/analysis/plant_parts_eia.py
Outdated
io_manager_key="pudl_sqlite_io_manager",
compute_kind="Python",
)
def plant_parts_eia_asset(
mega_generators_eia: pd.DataFrame,
def out_eia__plant_parts(
This asset name doesn't seem to fit our existing naming model. Doesn't it need some kind of asset type (probably yearly)?
The asset type can be optional. I didn't think there was a logical frequency or type for the table, though I could be wrong. @cmgosnell and @katie-lamb, what do you think?
I believe the plant parts are assigned annually, but correct me if I'm wrong @cmgosnell
It's an annual table! (It technically could be generated as a monthly table, but it would be ginormous, and right now this is mostly being used to link up to annual FERC data.)
Thanks! It has been renamed to out_eia__yearly_plant_parts.
This is ready for another review @e-belfer!
Looks good and fast ETL runs as is.
PR Overview
This PR:
- Adds a Package.get_sorted_resourced() method so we can order the tables in the data dictionary and datasette.
- Adds JINJA_FILTERS to pudl.metadata.helpers that can be added to jinja environments. I added this because I was getting a malformed cross-reference Sphinx error on intermediate table names because of the preceding underscore.
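The overview does not show what the Jinja filters contain, so the following is only a sketch of one plausible shape: a dictionary of filter functions, with a single filter that strips leading underscores so an intermediate table name can be used where a leading underscore causes trouble. The filter name `strip_leading_underscores` and its behavior are assumptions, not the contents of `pudl.metadata.helpers`.

```python
def strip_leading_underscores(name: str) -> str:
    """Return the name without leading underscores (hypothetical filter)."""
    return name.lstrip("_")


# A dictionary of filters keyed by the name used inside templates.
# The single entry here is illustrative only.
JINJA_FILTERS = {"strip_leading_underscores": strip_leading_underscores}

label = JINJA_FILTERS["strip_leading_underscores"]("_out_eia__yearly_generators")
```

With Jinja2 installed, a dictionary like this can be registered on an environment via `env.filters.update(JINJA_FILTERS)`, after which a template could write `{{ resource.name | strip_leading_underscores }}`.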