v2023.12.01
github-actions released this 05 Dec 19:15
What's Changed
- Dbf xbrl mapping by @zaneselvans in #2088
- eia860m september update by @cmgosnell in #2079
- integrate the electric energy source dbf & xbrl tbl by @cmgosnell in #2094
- fix ferc1 record_id validation errors by @cmgosnell in #2102
- Electric Dispositions Table by @cmgosnell in #2100
- Use app token for auto-merging bot PRs when CI passes. by @zaneselvans in #2106
- fix table name in record_id test by @cmgosnell in #2111
- Transform f1 xmssn line by @aesharpe in #2103
- Map `f1_bal_sheet_cr` by @aesharpe in #2113
- Utility plant summary by @cmgosnell in #2105
- Allow Tox v4+ in the dev extras environment. by @zaneselvans in #2117
- Bump ferc-xbrl-extractor version to avoid Arelle locale issue by @zschira in #2118
- Map `f1_elc_op_mnt_expn` table by @aesharpe in #2114
- Map `f1_comp_balance_db` table by @aesharpe in #2112
- Merge release branch updates into our working `dev` branch. by @zaneselvans in #2133
- Add the `balance_sheet_assets_ferc1` table by @cmgosnell in #2127
- Xbrl metadata restructuring by @zaneselvans in #2136
- Add AWS creds to build-deploy-pudl action and copy outputs to s3 bucket by @bendnorman in #2137
- Mitigate zenodo dependency in docs build by @zaneselvans in #2150
- Update to new version of FERC XBRL Extractor by @zaneselvans in #2151
- Transform `f1_dacs_epda` by @zschira in #2143
- Add depreciation_amortization_summary_ferc1 to non-unique record ID's… by @zschira in #2154
- Integrate `income_statement_ferc1` table by @cmgosnell in #2147
- Move awscli from pudl package to docker image by @bendnorman in #2163
- Ferc1 xbrl table release notes by @zaneselvans in #2157
- Transform `f1_bal_sheet_cr` by @aesharpe in #2134
- Transform `f1_retained_erng` xbrl + dbf by @cmgosnell in #2155
- Transform `f1_elc_op_mnt_expn` by @aesharpe in #2162
- Transform `f1_accumdepr_prvsn` dbf + xbrl by @zaneselvans in #2119
- Replace lingering transmission_ferc1 w/ transmission_statistics_ferc1 by @zaneselvans in #2178
- Make transform params stricter 2 by @aesharpe in #2177
- Validate the raw ferc1 tables in the settings by @cmgosnell in #2168
- Update & simplify PR template formatting / language. by @zaneselvans in #2181
- Integrate `f1_cash_flow` FERC1 table by @cmgosnell in #2184
- Transform electric_plant_depreciation_functional_ferc1 DBF + XBRL by @zaneselvans in #2183
- Delete old notebooks by @jdangerx in #2186
- Transform f1_elctrc_oper_rev by @zschira in #2192
- Add direct S3 nightly build download links to README. by @zaneselvans in #2199
- Update documentation to refer to `archiver` and not `scrapers` or `zenodo-storage` by @jdangerx in #2190
- Notify community-dev channel by @bendnorman in #2211
- Transform f1 othr reg liab by @zschira in #2215
- Add other_regulatory_liabilities_ferc1 to the list of non-unique reco… by @zschira in #2222
- Change FuelFix for nuclear from mmmbtu to mmbtu by @aesharpe in #2233
- Xbrl test speedups by @zschira in #2229
- Restrict the FERC1 output tables with the PudlTabl's start/end date by @cmgosnell in #2238
- Fix fuel ferc1 expected values by @aesharpe in #2241
- add methods to PudlTabl so it can be serialized and de-serialized (v2) by @arengel in #2251
- Retain all reported EIA sector codes for harvesting by @knordback in #2200
- Pin SQLAlchemy<2.0 and allow pip 23 by @zaneselvans in #2268
- Use Workload Identity Federation in GH Actions by @jdangerx in #2259
- Retain all EIA sector IDs for harvesting by @zaneselvans in #2270
- Split EIA extract steps and add field types to dataset_settings by @bendnorman in #2263
- Dagster cli wrapper by @zschira in #2272
- Add design process documentation by @jdangerx in #2282
- Remove tables from settings by @zschira in #2286
- Add more service accounts to Workload Identity Federation by @jdangerx in #2273
- Add EIA 176 to sources.py by @e-belfer in #2258
- Docs updates for Annual Updates by @aesharpe in #2089
- Add `electricity_sales_by_rate_schedule_ferc1` table by @aesharpe in #2205
- Rework of FERC to EIA logistic regression model by @katie-lamb in #2276
- Add fuel allocation release notes by @zaneselvans in #2308
- Extract 860 EnviroAssoc and EnviroEquip Tables in PUDL by @e-belfer in #2281
- Integrate FERC-EIA record linkage into PUDL by @cmgosnell in #2224
- Update unit tests for allocate net gen by @jdangerx in #2297
- Fix google auth error in tox-pytest by @jdangerx in #2311
- Ferc to EIA match release notes by @katie-lamb in #2313
- Convert FERC1 -> EIA missing ID validation ET[L] to Dagster by @jdangerx in #2309
- Update integration tests to work with Dagster ETL by @zschira in #2299
- Convert epacems_to_parquet command to run dagster asset by @bendnorman in #2300
- Add spot fix function/class by @e-belfer in #2254
- Update previous balancing_authority_code_eia fixes for plants_eia860 by @e-belfer in #2312
- Breakpoints in Dagster by @jdangerx in #2322
- Fix balancing_authority_name update in cases where no ba_name_to_code_map() by @e-belfer in #2323
- Update doi to point to new epacamd_eia archive by @aesharpe in #2316
- Update local cache when using `--gcs-cache-path` by @jdangerx in #2326
- Resolve dev -> dagster merge fixes by @bendnorman in #2318
- Configure dagster env vars from settings if not set already by @zschira in #2332
- Remove code deprecated by dagster by @zschira in #2341
- Run nightly builds in dagster-world by @jdangerx in #2344
- Update s3 bucket urls to use https by @bendnorman in #2351
- Add jobs for excluding EPA CEMS assets by @bendnorman in #2343
- Merge s3 readme url changes into `dev` by @bendnorman in #2355
- Add boiler-associated attributes from EIA 860 6.2 EnvrEquip tables to ETL by @e-belfer in #2319
- Parameterize reconstructable jobs to set loglevel by @bendnorman in #2348
- Merge `dev` into `dagster-asset-etl` once again by @jdangerx in #2353
- Make nightly build message a little cleaner. by @jdangerx in #2363
- Correct eia transform doc strings by @bendnorman in #2357
- Add EIA 860 EnviroAssoc/EnviroEquip boiler and emissions control attributes by @zaneselvans in #2364
- Fix inconsistently reported leading zeros in EIA boiler id by @zaneselvans in #2367
- Consolidate EPA CEMS parquet files by @jdangerx in #2354
- Update `main` as the nightly builds on `dev` have succeeded. by @zaneselvans in #2374
- Reorder PK of boilers_eia860 to group data like other tables by @zaneselvans in #2375
- Remove deprecated ferc extract functions by @bendnorman in #2369
- Fix Bandit issues by @jdangerx in #2381
- Small tweaks from docs PR by @jdangerx in #2372
- Move `--no-sign-request` to the end of the command in the nightly_data_build docs by @aesharpe in #2388
- Bump allowed range of Dagster versions to >=1.1,<1.3 by @zaneselvans in #2382
- Rename electric_opex_ferc1 table electric_operating_expenses_ferc1 by @zaneselvans in #2392
- Dagster filter table schemas by @zschira in #2393
- Dagster ETL documentation by @bendnorman in #2306
- Move `prime_mover_code` back into `boiler_fuel_eia923` by @cmgosnell in #2362
- `dev` -> `dagster-asset-etl` merge by @jdangerx in #2398
- Rename PUDL_CACHE to PUDL_INPUT by @jdangerx in #2401
- Throw error when db is missing table schema by @bendnorman in #2410
- Standardize asset group and asset module names by @zaneselvans in #2411
- Rename check_pudl_fks to pudl_check_fks to align w/ other CLI names by @zaneselvans in #2416
- Pin grpcio==1.46.1 (arm64 compatible version available on conda-forge) by @zaneselvans in #2428
- Convert devtools notebooks to use dagster concepts by @bendnorman in #2356
- Add sqlite timeout by @bendnorman in #2430
- Get PUDL env settings under test by @jdangerx in #2424
- Convert FERC-714 ETL to use Dagster by @zaneselvans in #2421
- Simplify EPA CAMD EIA X-walk output to just read from DB. by @zaneselvans in #2440
- Expand table descriptions. by @zaneselvans in #1910
- Teach Docker about new env vars by @jdangerx in #2443
- Dagster asset etl by @bendnorman in #2104
- Fix "Make a PR" link in nightly build success message by @jdangerx in #2449
- Add new settings file behavior in docs by @bendnorman in #2441
- Convert EIA-861 ETL to use Dagster by @zaneselvans in #2403
- Integrate EIA-923 Annual Environmental Information (Schedule 8C) spreadsheet maps by @zaneselvans in #2447
- Throw error if database metadata has changed by @bendnorman in #2331
- Untangle eia_transform / harvesting / bga multi-asset by @zaneselvans in #2450
- Solve `2023-03-29` nightly build failure issues by @bendnorman in #2469
- Create `PudlSQLiteIOManager` to accept a `Package` object by @bendnorman in #2466
- Nightly build 2023-03-30 by @zaneselvans in #2473
- Update PUDL from Python 3.10 to 3.11 by @zaneselvans in #2408
- Use enforce_schema() and read-chunking in PudlSQLiteIOManager. by @zaneselvans in #2459
- Add `PUDL_INPUT` and `PUDL_OUTPUT` vars to `zenodo-cache-sync` action by @bendnorman in #2476
- Move project metadata & build specs from setup.py to pyproject.toml by @zaneselvans in #2479
- Parallelize Dagster processing of EPA CEMS by @zschira in #2472
- Retry on md5 mismatch in GCS by @jdangerx in #2488
- Parallelize tests by @jdangerx in #2432
- Use FERC XBRL Extractor 0.8.2 to allow pandas 2.0 by @zaneselvans in #2492
- Dagsterize eia tables by @e-belfer in #2496
- Create simple SQL view assets by @bendnorman in #2445
- Create Dynamic PudlTabl methods by @zschira in #2498
- Add funding and release notes URLs to the project. by @zaneselvans in #2507
- Add script to set DB schema before running ETL pipeline by @jdangerx in #2515
- Add 1:m matches into plant_parts_eia by @e-belfer in #2429
- Move ferc eia manual mapping notebook 2 by @aesharpe in #2502
- Dagsterize EIA output tables by @zaneselvans in #2519
- Validate and save csv of all 1:m FERC-EIA matches by @e-belfer in #2516
- Revive comdev notify action by @bendnorman in #2538
- Ferc output conversion by @e-belfer in #2521
- Flesh-out the Sub Plant ID by @cmgosnell in #2491
- Dagsterize output tables by @e-belfer in #2534
- Manage pudl.sqlite schema with alembic by @jdangerx in #2523
- Initial refactoring of dbf extraction process by @rousik in #2536
- Improve dagster docs by @bendnorman in #2451
- Cast resource filter values to lowercase by @rousik in #2562
- Fix income_statement_ferc1 utility_type categorization bug by @zaneselvans in #2565
- Update to use 2021 version of the epacamd_eia crosswalk by @zschira in #2566
- Integrate greg M's PM code fix by @cmgosnell in #2446
- Revert crosswalk to 2018 by @zschira in #2582
- Use multiple years in crosswalk by @zschira in #2580
- Emission control table by @aesharpe in #2561
- Merge two 'head' migrations together by @zschira in #2590
- Clean epacamd_eia mismatches before epacems creation by @zschira in #2593
- Allow JupyterLab v4.0.0 in pudl-dev environment. by @zaneselvans in #2594
- link straight to development setup page by @AppTrain in #2592
- Fix rogue operational status codes causing foreign key failure by @aesharpe in #2602
- Add encoding step to the transform step for the emissions_control_equ… by @aesharpe in #2617
- Compare sub-total calculations to total calculations for XBRL explosion by @e-belfer in #2615
- migrate calculation checks into the ferc1 table transformers by @cmgosnell in #2618
- Add records correcting FERC 1 calculations that are off by @zaneselvans in #2620
- Boiler cooler stack flue association tables by @aesharpe in #2587
- Bootstrap for the run-etl action by @rousik in #2631
- Boiler cooler stack flue table (bb fixes) by @aesharpe in #2629
- Update max rows for boils_eia860, plants_eia860, and pu_eia860 tables… by @aesharpe in #2634
- Fix job names in docs by @dstansby in #2641
- add minor inter table calc fixes by @cmgosnell in #2635
- Add business_model, service_type to sales_eia861 PK by @zaneselvans in #2637
- Integrate FERC Form 2 dbf formats into ferc_to_sqlite by @rousik in #2564
- Rename and test XBRL metadata calculations by @cmgosnell in #2563
- Use PUDL_INPUT not hard-coded data dir in datastore CLI by @zaneselvans in #2651
- update expected # of plant_in_service rows by @cmgosnell in #2650
- Fix ferc2 full etl integration test issues by @rousik in #2652
- Filter FERC714 ETL by year by @e-belfer in #2649
- Minor Documentation changes by @AppTrain in #2598
- Add release notes explaining FERC 1 metadata cleanup by @zaneselvans in #2626
- Manage duplicate PKs in EIA-861 table transforms by @zaneselvans in #2648
- Nightly build 2023 06 14 by @zaneselvans in #2668
- Replace [email protected] with github discussion link on datasette by @bendnorman in #2665
- Throw different exception when dbc file is missing by @rousik in #2654
- Read EIA860 data in parallel by @dstansby in #2644
- Deduplicate FERC 2 respondent IDs by @zaneselvans in #2661
- Integrate FERC Form 6 from dbf by @rousik in #2595
- Fix `retained_earnings_ferc1` transform by @e-belfer in #2645
- Tweaks to emissions control equipment table by @zaneselvans in #2664
- Upgrade to ferc-xbrl-extractor v0.8.3 by @zaneselvans in #2675
- Fix Alembic migration diversions in dev by @e-belfer in #2681
- Add publish_destinations field to the ETL config by @rousik in #2659
- Convert EIA-861 and FERC 714 service territory outputs to Dagster assets by @e-belfer in #2550
- Publish ferc6.sqlite to Datasette. Add AWS download link. by @zaneselvans in #2686
- Spot fix ferc exploder by @aesharpe in #2647
- Bring the Census DP1 to SQLite ETL into dagster by @e-belfer in #2621
- Add `electric_plant_depreciation_changes_ferc1` into explosion by @e-belfer in #2662
- Add `electric_plant_depreciation_functional_ferc1` into the calc checking process by @e-belfer in #2687
- Update plant_in_service_ferc1 expected row count by @zaneselvans in #2690
- Allow setuptools 68.0.0 by @zaneselvans in #2692
- Fix apparent typos in FERC 1 rename params. by @zaneselvans in #2691
- Remove [email protected] from docs by @bendnorman in #2670
- ferc1 💥 rate base tags by @cmgosnell in #2697
- fix straggler problems with the rightsutility rename by @cmgosnell in #2704
- Pin grpcio to 1.56 by @zaneselvans in #2702
- Fix xbrl metadata renaming in the `electric_plant_depreciation_functional_ferc1` table by @cmgosnell in #2712
- Limit dagster concurrency in nightly builds by @bendnorman in #2713
- `utility_plant_summary_ferc1` to `plant_in_service_ferc1` link by @cmgosnell in #2715
- add straggler dbf-only factoids into ferc1 table metadata by @cmgosnell in #2716
- Convert EIA generation and fuel allocations to Dagster by @zaneselvans in #2527
- Remove refs to 2i2c JupyterHub. Move nightly build links to data access page by @zaneselvans in #2719
- Link electric OpEx to income statement table by @zaneselvans in #2723
- Identify and link inter-table relationships when they occur within a dimension (e.g. `utility_type`) by @e-belfer in #2669
- Organize FERC 1 XBRL metadata with a calculation forest by @zaneselvans in #2653
- delete it all!!!! lol remove the no-longer needed duplication removals by @cmgosnell in #2724
- Fix duplicated components in balance sheet liabilities FERC1 table by @e-belfer in #2727
- Make a calculation component table and use in the inter-table/inter-dimension checking by @cmgosnell in #2721
- Extract FERC Form 60 DBF data to SQLite by @e-belfer in #2734
- Add old FERC 60 DBF data to the Datasette deployment. by @zaneselvans in #2739
- Fix small typos by @bendnorman in #2725
- Replace references to `dagit` with dagster UI and `dagster-webserver` by @bendnorman in #2749
- Inject missing dbf-only factoids into XBRL metadata by @e-belfer in #2747
- Add parent dimensions into calculation component table by @cmgosnell in #2753
- Add FERC60 to data access docs by @e-belfer in #2750
- Don't drop leaf calculation components from calc comps table. by @zaneselvans in #2754
- Clean-up XBRL calculation fixes by @cmgosnell in #2728
- Modernize importlib usage by @zaneselvans in #2759
- Add EIA860, EIA860m, & EIA923 2022 early release data by @aesharpe in #2741
- Fix validation errors for 2022 EIA data by @aesharpe in #2778
- Dagsterize MCOE output tables by @zaneselvans in #2553
- Eia861 2022 by @aesharpe in #2782
- Simplify config reading and path configurations by @rousik in #2640
- Fix `pudl_setup` `pudl_in` and `pudl_out` args by @bendnorman in #2796
- Apply new naming convention to raw and core intermediate assets by @bendnorman in #2789
- Fix anonymous constraints by @jdangerx in #2795
- Integrate 2022 CEMS data by @e-belfer in #2779
- Use all dimensions in `XbrlCalculationForestFerc1` and exploded tables by @zaneselvans in #2763
- Functional deprish fix by @cmgosnell in #2794
- Stop importing urllib3 Retry from deprecated location by @zaneselvans in #2806
- Allow a mix of Zenodo sandbox & production DOIs by @zaneselvans in #2798
- Update PUDL to pandas 2.0 by @zaneselvans in #2320
- enable FERC explosion tags to be dimension specific by @cmgosnell in #2817
- Add github action usage notebook and use large runner for tox-pytest … by @bendnorman in #2823
- Add Hawaii to CEMS by @e-belfer in #2816
- Update description for small plants table by @aesharpe in #2815
- Add references to CEMS and Dagster in the annual updates docs by @aesharpe in #2814
- Add generator to the retirement_year / _month columns for the monthly… by @aesharpe in #2835
- Simplify & enhance linting by using ruff. by @zaneselvans in #2824
- Fix docker builds by @zaneselvans in #2837
- Skip integration tests on draft PRs by @jdangerx in #2839
- Convert the FERC exploded forest to a table for readability by @cmgosnell in #2832
- Add assertions and workflow updates to debug ogr2ogr failure. by @zaneselvans in #2849
- Add type hints to helpers and update DBF extraction tests by @zaneselvans in #2841
- Fix missing keywords in archived datasources by @e-belfer in #2851
- Rename s3 bucket by @bendnorman in #2793
- Expand multi-dimensional totals correctly by @jdangerx in #2855
- Update docs to reflect switch from flake8 to ruff by @zaneselvans in #2859
- Add contributors & ORCIDs to metadata by @zaneselvans in #2809
- Remove libabseil version pins since gdal & geopandas have new versions. by @zaneselvans in #2866
- Make leafy balance sheet assets & liabilities data by @cmgosnell in #2805
- 💥 FERC feature branch 💥 : FERC tables post calculation validation by concat-ing & deduping by @cmgosnell in #2633
- Add PHMSA Natural Gas annual report DOI to datastore. by @zaneselvans in #2884
- Update expected row counts in 2 altered FERC 1 tables. by @zaneselvans in #2885
- preliminary version of standardization of the calc metrics by @cmgosnell in #2880
- Fix inter-table 1:1 dimensions in calculations for XBRL explosion by @e-belfer in #2890
- Dagsterize `mega_generators` and `plant_parts_eia` by @katie-lamb in #2714
- Resolve conflicting dependencies by @zaneselvans in #2900
- Fix for harvesting owner utilities by @katie-lamb in #2903
- remove duplication of explosion input table names by @cmgosnell in #2921
- Clean up explosion: add plant function tag by @aesharpe in #2916
- standardize the calc checks for the total to subtotal calcs by @cmgosnell in #2886
- Extract more data from FERC XBRLs and handle that new data in ETL by @jdangerx in #2821
- Remove libsnappy binary dependency by @zaneselvans in #2923
- Pin transitive dependency croniter<2; bump pyarrow to v13 by @zaneselvans in #2928
- Fix too-many-paths error in nightly build by @jdangerx in #2931
- Unpin croniter as package metadata has been fixed. by @zaneselvans in #2933
- Fix small plant_id_ferc1 fail by @cmgosnell in #2935
- New zenodo api bandaid by @jdangerx in #2942
- dagsterification of ferc1_eia by @cmgosnell in #2938
- Parallelize extraction of Excel spreadsheets by @e-belfer in #2943
- Hotfix docker build by setting LD_LIBRARY_PATH by @rousik in #2950
- Bump versions for ruff & black by @zaneselvans in #2952
- FERC1 2022 report year fix by @jdangerx in #2947
- FERC1 2022 by @jdangerx in #2948
- Trivial change by @zaneselvans in #2982
- Increase datasette cloud run memory from 4GB to 32GB by @bendnorman in #2990
- Improve calculation error checking by @zaneselvans in #2915
- Only generate alphanumeric entity IDs in test by @jdangerx in #2993
- Add data maturity for 923m by @aesharpe in #2936
- Fix calculating the report_date in demand_hourly_pa_ferc714 by @rousik in #2999
- Update sources, DOI and copyright link in PUDL by @e-belfer in #3004
- Switch regex strategy to sampling strategy to improve performance by @jdangerx in #2998
- add alembic schema changes for the recent constraint. by @rousik in #3012
- Set up Cloud SQL Postgres database for dagster storage by @bendnorman in #2996
- Add dagster postgres env vars to build-deploy-pudl.yaml by @bendnorman in #3014
- only fix a reporting_frequency_code when the column exists by @cmgosnell in #3013
- update expected rows for no-fips id-ed respondents but keep annualize… by @cmgosnell in #3023
- Fix validation `test_fbp_ferc1_mismatched_fuels` error by @cmgosnell in #3025
- Deploy Datasette to fly.io instead of Cloud Run by @jdangerx in #3018
- Test that DB schema matches the Alembic migrations by @jdangerx in #3027
- Fix XBRL extraction clobber by @jdangerx in #3026
- Always use tmp path for clobber tests. by @jdangerx in #3043
- Set up reproducible Python environments with conda-lock by @zaneselvans in #2968
- Add EIA860 2022 final release data by @aesharpe in #3040
- Eia861 2022 final release by @aesharpe in #3048
- Bring conda-lock workflows into main by @zaneselvans in #3049
- Spot fix cliffside capacity by @cmgosnell in #3046
- Always take lockfile version from current branch when merging by @zaneselvans in #3056
- Update Lockfile by @github-actions in #3057
- Update PUDL to SQLAlchemy 2.0 by @zaneselvans in #2267
- Pin conda-lock<2.5.0 by @zaneselvans in #3063
- Change lock file merge strategy to look for .yml instead of .yaml files by @bendnorman in #3065
- Fix full build notification logic by @bendnorman in #3058
- Unpin conda lock by @zaneselvans in #3069
- Use ruff instead of black for autoformatting by @zaneselvans in #3060
- Update development setup documentation by @zaneselvans in #3074
- Simplify process of bootstrapping pudl-dev conda env. by @zaneselvans in #3080
- Make nightly build outputs smaller by @zaneselvans in #3084
- Fix the override paths when running in github actions by @rousik in #3045
- Clean up explosion by @e-belfer in #2894
- Add DBF metadata to `electric_plant_depreciation_functional_ferc1` by @e-belfer in #2918
- Update EIA 860M and EIA 861 DOIs by @jdangerx in #3085
- Refactor PUDL to use Pydantic v2 by @zaneselvans in #3051
- Resolve core dump failure by @bendnorman in #3090
- dotenv compatibility by @jdangerx in #3092
- Eia923 2022 final release q4 update Nov 21 by @robertozanchi in #3073
- Dev environment cleanup and documentation tweaks by @zaneselvans in #3093
- Rename dbf-derived FERC SQLite DBs by @zaneselvans in #3094
- Add GHA workflow for release-on-tag by @zaneselvans in #3124
New Contributors
- @knordback made their first contribution in #2200
- @AppTrain made their first contribution in #2592
- @dstansby made their first contribution in #2641
- @robertozanchi made their first contribution in #3073
Full Changelog: v2022.11.30...v2023.12.01