Apply new naming conventions to devtool notebooks #3228
Merged
5 commits
5e2a5e4 Apply new naming conventions to devtool notebooks (bendnorman)
dad6d03 Clean devtools notebook outputs (bendnorman)
b3bbbf3 Resolve special case for core_ferc1__yearly_steam_plants_sched402 and… (bendnorman)
d9ee46f Update devtools/sqlite-table-diff.ipynb to use new table names (bendnorman)
b4baedb Merge branch 'main' into update-table-names-devtools (bendnorman)
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -6,7 +6,7 @@ | |
"metadata": {}, | ||
"source": [ | ||
"# Inspecting dagster assets\n", | ||
"This notebooks allows you to inspect dagster asset values.\n", | ||
"This notebooks allows you to inspect dagster asset values. **This is just a template notebook. Do your asset explorations in a copy of this notebook.** \n", | ||
"\n", | ||
"Some assets are written to the database in which case you can just pull the tables into pandas or explore them in the database. However, many assets use the default IO Manager which writes asset values to the `$DAGSTER_HOME/storage/` directory as pickle files. Dagster provides a method for inspecting asset values no matter what IO Manager the asset uses." | ||
] | ||
|
@@ -50,61 +50,8 @@ | |
"\n", | ||
"from pudl.etl import defs\n", | ||
"\n", | ||
"asset_key = \"exploded_balance_sheet_assets_ferc1\"\n", | ||
I removed some custom asset exploration work that got committed here. This is just supposed to be a template notebook. |
||
"df = defs.load_asset_value(AssetKey(asset_key))\n", | ||
"\n", | ||
"#df[df.row_type_xbrl == \"correction\"].xbrl_factoid.value_counts()\n", | ||
"#df[(df.xbrl_factoid.isin([\"operation_expense\", \"maintenance_expense\"]))&(df.rel_diff.notnull())&(df.rel_diff!=0)].sort_values(['utility_id_ferc1', 'report_year', 'xbrl_factoid', 'rel_diff']).head(50)" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"id": "b2d99594", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"df[(df.xbrl_factoid==\"accumulated_depreciation\")&(df.plant_status==\"in_service\")&(df.plant_function==\"total\")]" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"id": "467111b1", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"df[df.xbrl_factoid.isin(factoids)&(df.utility_id_ferc1==9)&(df.report_year==1998)]" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"id": "c6f7427a", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"factoids = ['distribution_maintenance_expense_electric',\n", | ||
" 'hydraulic_power_generation_maintenance_expense',\n", | ||
" 'maintenance_of_general_plant',\n", | ||
" 'nuclear_power_generation_maintenance_expense',\n", | ||
" 'other_power_generation_maintenance_expense',\n", | ||
" 'regional_market_maintenance_expense',\n", | ||
" 'steam_power_generation_maintenance_expense',\n", | ||
" 'transmission_maintenance_expense_electric']" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"id": "951b718d", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"asset_key = \"calculation_components_xbrl_ferc1\"\n", | ||
"calcs = defs.load_asset_value(AssetKey(asset_key))\n", | ||
"\n", | ||
"calcs[(calcs.xbrl_factoid_parent == \"accumulated_depreciation\")].head(50)" | ||
"asset_key = \"_core_eia861__balancing_authority\"\n", | ||
"df = defs.load_asset_value(AssetKey(asset_key))" | ||
] | ||
}, | ||
{ | ||
|
@@ -128,25 +75,11 @@ | |
"\n", | ||
"from pudl.etl import defs\n", | ||
"\n", | ||
"asset_key = \"emissions_unit_ids_epacems\"\n", | ||
"asset_key = \"core_eia923__monthly_generation\"\n", | ||
"df = defs.load_asset_value(AssetKey(asset_key))\n", | ||
"\n", | ||
"df" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"id": "9f0d118b", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"from pudl.output.epacems import epacems\n", | ||
"\n", | ||
"test_epacems = epacems(states = [\"ID\"], years = [2022])\n", | ||
"\n", | ||
"test_epacems[test_epacems.operating_datetime_utc>=\"2022-01-04\"].head(40)" | ||
] | ||
} | ||
], | ||
"metadata": { | ||
|
@@ -165,7 +98,7 @@ | |
"name": "python", | ||
"nbconvert_exporter": "python", | ||
"pygments_lexer": "ipython3", | ||
"version": "3.11.5" | ||
"version": "3.11.7" | ||
} | ||
}, | ||
"nbformat": 4, | ||
|
@cmgosnell I ran into some FERC-related errors I wasn't sure how to solve in the last two cells of this notebook.
okay, the last cell is, I believe, a question for @katherinelamb / @zschira to verify my assumption: it looks like the new FERC plant classifier got pulled out of the transform step, so the special exception of passing the fuel table into the steam table's transform is no longer needed! If that's true, I think we can delete this cell and take out the
if table_name == "core_ferc1__yearly_steam_plants_sched402":
in the previous cell.
and the second-to-last cell failed while validating the calculations in the table. The expected error rates were all set using the full and fast ETL settings, and this notebook has a cell up top that sets
years = [2020, 2021]
If you change it to
years = [2020, 2022]
and rerun the notebook, it will work fine.
this is definitely a fragile part of testing the transform step because BELIEVE IT OR NOT the newer years are much less clean in the calculations than the old years.
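To illustrate why the choice of years matters here, the following is a minimal sketch and not PUDL's actual validation code. The records, the `error_rate` helper and the `MAX_ERROR_RATE` tolerance are all invented for illustration; the only point is that a tolerance tuned on one set of years may not hold for another.

```python
import math

# Invented records: (report_year, reported_value, calculated_value).
# These numbers are NOT PUDL data; they only exist to drive the example.
RECORDS = [
    (2020, 100.0, 100.0),
    (2020, 50.0, 50.0),
    (2021, 80.0, 95.0),  # calculated value disagrees with the reported one
    (2021, 40.0, 40.0),
    (2022, 70.0, 70.0),
    (2022, 30.0, 30.0),
]

# Hypothetical tolerance, standing in for an "expected error rate".
MAX_ERROR_RATE = 0.2


def error_rate(records, years, rel_tol=1e-6):
    """Return the share of records in `years` whose calculation is off."""
    selected = [r for r in records if r[0] in years]
    if not selected:
        raise ValueError("no records for the requested years")
    bad = sum(
        1
        for _, reported, calculated in selected
        if not math.isclose(reported, calculated, rel_tol=rel_tol)
    )
    return bad / len(selected)


for years in ([2020, 2021], [2020, 2022]):
    rate = error_rate(RECORDS, years)
    status = "ok" if rate <= MAX_ERROR_RATE else "exceeds tolerance"
    print(years, f"{rate:.2f}", status)
```

With these invented numbers, `[2020, 2021]` gives a rate of 0.25 and exceeds the tolerance, while `[2020, 2022]` gives 0.00 and passes.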
@cmgosnell, yes that's true!
core_ferc1__yearly_steam_plants_sched402
is now handled like any other transform, and no longer has plant IDs
Thanks y'all! Just made the changes.
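The cleanup agreed on in this thread amounts to dropping a per-table branch so that every table takes the same path. A rough sketch of that shape follows; `transform_table`, `transform_all`, `example_fuel_table` and the row dictionaries are hypothetical stand-ins, not the functions or data in the devtools notebook.

```python
def transform_table(rows, table_name):
    """Hypothetical generic transform: copy each row and tag it with its table."""
    return [{**row, "table_name": table_name} for row in rows]


def transform_all(raw_tables):
    """Send every table through the same transform, with no per-table branch.

    Per the thread, the notebook previously carried a check of the form
        if table_name == "core_ferc1__yearly_steam_plants_sched402": ...
    so that the fuel table could be passed into the steam table's transform.
    """
    return {
        table_name: transform_table(rows, table_name)
        for table_name, rows in raw_tables.items()
    }


# Hypothetical inputs: one real table name from the thread, one placeholder.
raw_tables = {
    "core_ferc1__yearly_steam_plants_sched402": [{"report_year": 2020}],
    "example_fuel_table": [{"report_year": 2020}],
}
transformed = transform_all(raw_tables)
print(sorted(transformed))
```

Because no table needs extra inputs in this sketch, adding or renaming a table only changes the keys of `raw_tables`, not the loop.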