Merge dev into main for 2023-11-09 #3031

Merged: 63 commits into main, Nov 9, 2023

Conversation

zaneselvans
Member

  • First pass at integrating the monthly EIA923 data into the rest of the EIA data. This includes updating the package data to account for the 2023 year and updating how data maturities are assigned to 923 data. This also updates some of the expected row counts. It should still fail on the gen_eia923 table because the row count went down, which doesn't seem right. There are also some failures related to check_date_freq, as fewer than 12 months are now expected in a given round of updates. Will handle those errors in another commit.
  • remove breakpoint
  • Add function to drop ytd records for annual tables
  • Adjust monthly row expectations for gf and frc tables after dropping ytd values for annual tables
  • Tweak the way we add data maturity to the eia923 monthly files and remove double returns from the drop_ytd_for_annual_tables function
  • Little updates: - Add a note about how the plants are getting dropped in the gen_eia923 output table and link to the issue.
  • For now, comment out the checks that make sure we have the same years of EIA923 and EIA860 data. This is causing issues for the monthly EIA923 data that gets integrated ahead of any available 860 data. This might cause issues elsewhere which is why I haven't committed to fully deleting it yet.
  • Update min max rows
  • Add data_maturity field to harvested EIA tables so that we can drop ytd records from annual EIA tables
  • Address PR comments: - Restructure the way that the data_maturity field is dropped from certain tables when merging multiple tables together that each have that field. Previously it was ad-hoc, now it just gets dropped in the denorm_by_plant function.
  • Fix release note trailing whitespace error
  • Update test_eia923_dependency function to make sure some 860 and 923 years overlap but don't need to be the same
  • Only generate alphanumeric entity IDs in test - non-printable characters seem to break groupby. (#2993)
  • Set up Cloud SQL postgres database for dagster storage
  • Copy dagster.yaml after DAGSTER_HOME is created
  • Add proper quoting rules to DAGSTER_PG_PASSWORD secret
  • Use max cpus for nightly builds and spin dagster-storage SQL instance up and down
  • Create and delete Cloud SQL db during nightly builds
  • Set PUDL_SETTINGS_YML to etl_full.yml and add git sha to Cloud SQL database name
  • Add short github ref to database name
  • Update DAGSTER_PG_DB with short git sha
  • Update date range for nightly build links to include 2022
  • Update 923 settings files to accommodate 2023 data and update settings tests so that they aren't dependent on having the same years of EIA923 and EIA860 data
  • Fix calculating the report_date in demand_hourly_pa_ferc714
  • Require non-null report_date in FERC 714 hourly demand table.
  • Update date validation function to only look at instances where data_maturity is not ytd_incremental
  • Remove Cloud SQL lifecycle management from gcp_pudl_etl.sh script
  • Update data contributors, add zenodo role and doi field, update US copyright link
  • Update to ZenodoDoi class, update to https
  • Remove leftover string
  • Switch regex strategy to sampling strategy to improve performance (#2998)
  • add alembic schema changes for the recent constraint.
  • only fix a reporting_frequency_code when the column exists
  • Update responses requirement from <0.24,>=0.14 to >=0.14,<0.25
  • Update pyarrow requirement from <14,>=13 to >=13,<15
  • Update dagster-postgres requirement
  • [pre-commit.ci] pre-commit autoupdate
  • update tox and eia923 rows
  • update expected rows for no-fips id-ed respondents but keep annualized demand
  • add report year validation test
  • add minmax rows into validation test for chonky table
  • idk exactly why the "nan"s began existing but this fixes it
  • revert the replace of "nan" by stopping introducing them! plus some light clean up
  • REALLY REALLY its a nullable string
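
Several of the items above rely on the data_maturity field to keep year-to-date (ytd) monthly records out of annual tables. The following is a minimal sketch of that idea in plain Python, not the project's actual code: the function name drop_ytd_for_annual_tables and the value "ytd_incremental" come from the list above, while the Record type, its fields, the "final" label, and the function signature are assumptions made for illustration.

```python
from dataclasses import dataclass

# Maturity label taken from the wording in the list above.
YTD_MATURITY = "ytd_incremental"


@dataclass(frozen=True)
class Record:
    """Hypothetical row of an annual EIA table (fields are illustrative)."""

    plant_id: int
    report_year: int
    net_generation_mwh: float
    data_maturity: str  # "final" is an assumed label for complete years


def drop_ytd_for_annual_tables(records: list[Record]) -> list[Record]:
    """Keep only records that do not come from a partial, year-to-date release."""
    return [r for r in records if r.data_maturity != YTD_MATURITY]
```

For example, passing one "final" 2022 record and one "ytd_incremental" 2023 record returns only the 2022 record, so a partial year never appears alongside complete years in an annual table.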

aesharpe and others added 30 commits October 12, 2023 10:09
…e EIA data. This includes updating the package data to account for the 2023 year and updating how data maturities are assigned to 923 data. This also updates some of the expected row counts. It should still fail on the gen_eia923 table because the row count went down, which doesn't seem right. There are also some failures related to check_date_freq, as fewer than 12 months are now expected in a given round of updates. Will handle those errors in another commit.
…move double returns from the drop_ytd_for_annual_tables function
- Add a note about how the plants are getting dropped in the gen_eia923 output table and link to the issue.

- Update the way we tell whether an EIA923 filing is monthly or annual based on feedback in the PR
… of EIA923 and EIA860 data. This is causing issues for the monthly EIA923 data that gets integrated ahead of any available 860 data. This might cause issues elsewhere which is why I haven't committed to fully deleting it yet.
- Restructure the way that the data_maturity field is dropped from certain tables when merging multiple tables together that each have that field. Previously it was ad-hoc, now it just gets dropped in the denorm_by_plant function.

- This also entails changing how the data_maturity field gets passed through to the agg tables: adds the data_maturity field to the agg function, selecting the 'first' instance of the data_maturity per agg because the fields are aggregated by date which is how data_maturity is determined. The annual aggregations drop the ytd rows before the aggregation happens so taking the first data_maturity value per year works in this case.

- Remove some comment fields

- Add new migration
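
The aggregation behavior described in this commit (drop ytd rows, then keep the first data_maturity value per group) can be sketched as follows. This is a plain-Python illustration under stated assumptions, not the project's implementation: aggregate_annual, the dictionary keys, and the "final" label are hypothetical, and the project itself presumably does this with dataframe aggregations.

```python
YTD_MATURITY = "ytd_incremental"  # label taken from the PR text


def aggregate_annual(rows: list[dict]) -> dict[tuple[int, int], dict]:
    """Sum generation per (plant_id, year), keeping the first data_maturity seen.

    Rows marked as year-to-date are dropped before aggregating, so every
    remaining row in a given year is expected to share one data_maturity value,
    which is why taking the first one is safe.
    """
    groups: dict[tuple[int, int], dict] = {}
    for row in rows:
        if row["data_maturity"] == YTD_MATURITY:
            continue  # drop ytd rows before the annual aggregation
        key = (row["plant_id"], int(row["report_date"][:4]))
        if key not in groups:
            # First row for this group: record its data_maturity.
            groups[key] = {
                "net_generation_mwh": 0.0,
                "data_maturity": row["data_maturity"],
            }
        groups[key]["net_generation_mwh"] += row["net_generation_mwh"]
    return groups
```

Dropping the ytd rows first is what makes the "first" choice unambiguous. Without that step, a year could mix complete and partial months, and the first value would depend on row order.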
… tests so that they aren't dependent on having the same years of EIA923 and EIA860 data
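
The relaxed year check mentioned here (EIA860 and EIA923 years must overlap but need not be identical) might look roughly like the sketch below. The function name check_eia_years_overlap and its signature are hypothetical. The real validation lives in the project's settings classes and tests, which are not shown on this page.

```python
def check_eia_years_overlap(eia860_years: list[int], eia923_years: list[int]) -> list[int]:
    """Return the shared years, raising if the two datasets have none in common.

    Unlike a strict equality check, this accepts monthly EIA923 data for a
    year that has no EIA860 release yet.
    """
    overlap = sorted(set(eia860_years) & set(eia923_years))
    if not overlap:
        raise ValueError(
            f"EIA860 years {sorted(eia860_years)} and EIA923 years "
            f"{sorted(eia923_years)} do not overlap."
        )
    return overlap
```

With this check, 860 years [2021, 2022] and 923 years [2022, 2023] pass because 2022 is shared, whereas a strict equality check would reject them.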
e-belfer and others added 27 commits November 2, 2023 12:44
Update sources, DOI and copyright link in PUDL
)

* Switch regex strategy to sampling strategy to improve performance

* Increase deadline
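
The idea behind this change, together with the earlier switch to alphanumeric-only entity IDs, is to draw test IDs from a fixed alphabet instead of generating them from a regular expression. The standard-library sketch below only illustrates that idea. The mention of a "deadline" suggests the tests use a property-based testing library, but that is an assumption, and sample_entity_id is a hypothetical helper rather than anything from the PR.

```python
import random
import string

# Fixed alphabet: ASCII letters and digits only, so no non-printable characters.
ALPHANUMERIC = string.ascii_letters + string.digits


def sample_entity_id(rng: random.Random, max_length: int = 8) -> str:
    """Draw a non-empty alphanumeric ID by sampling characters from ALPHANUMERIC."""
    length = rng.randint(1, max_length)
    return "".join(rng.choices(ALPHANUMERIC, k=length))


# Usage: a seeded generator gives reproducible IDs.
rng = random.Random(0)
entity_ids = [sample_entity_id(rng) for _ in range(5)]
```

Sampling directly from an alphabet avoids regex-driven generation and guarantees that every ID contains only printable characters.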
add alembic migration for the report_date non-null constraint that was recently added
…gres

Set up Cloud SQL Postgres database for dagster storage
Updates the requirements on [responses](https://github.com/getsentry/responses) to permit the latest version.
- [Release notes](https://github.com/getsentry/responses/releases)
- [Changelog](https://github.com/getsentry/responses/blob/master/CHANGES)
- [Commits](getsentry/responses@0.14.0...0.24.0)

---
updated-dependencies:
- dependency-name: responses
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>
Updates the requirements on [pyarrow](https://github.com/apache/arrow) to permit the latest version.
- [Commits](apache/arrow@go/v13.0.0...go/v14.0.0)

---
updated-dependencies:
- dependency-name: pyarrow
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>
Updates the requirements on [dagster-postgres](https://github.com/dagster-io/dagster) to permit the latest version.
- [Release notes](https://github.com/dagster-io/dagster/releases)
- [Changelog](https://github.com/dagster-io/dagster/blob/master/CHANGES.md)
- [Commits](https://github.com/dagster-io/dagster/commits)

---
updated-dependencies:
- dependency-name: dagster-postgres
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>
…/responses-gte-0.14-and-lt-0.25

Update responses requirement from <0.24,>=0.14 to >=0.14,<0.25
…/pyarrow-gte-13-and-lt-15

Update pyarrow requirement from <14,>=13 to >=13,<15
…/dagster-postgres-gte-0.21.5-and-lt-0.21.7

Update dagster-postgres requirement from <0.21.6,>=0.21.5 to >=0.21.5,<0.21.7
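
Each of the three dependency updates above widens the upper bound of a version range by one release. The simplified sketch below shows what that means in practice. It compares dotted numeric versions as integer tuples and ignores pre-releases and other PEP 440 details that real installers handle. The helpers parse_version and in_range are hypothetical.

```python
def parse_version(version: str) -> tuple[int, ...]:
    """Turn a dotted numeric version such as '0.21.6' into (0, 21, 6)."""
    return tuple(int(part) for part in version.split("."))


def in_range(version: str, lower: str, upper: str) -> bool:
    """Return True if lower <= version < upper, comparing as integer tuples."""
    return parse_version(lower) <= parse_version(version) < parse_version(upper)


# pyarrow: the old range ">=13,<14" rejects 14.0.0, the new ">=13,<15" accepts it.
old_allows_pyarrow_14 = in_range("14.0.0", "13", "14")
new_allows_pyarrow_14 = in_range("14.0.0", "13", "15")
```

The same pattern applies to the other two updates: responses 0.24.0 and dagster-postgres 0.21.6 fall outside the old ranges and inside the new ones.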
updates:
- [github.com/astral-sh/ruff-pre-commit: v0.1.3 → v0.1.4](astral-sh/ruff-pre-commit@v0.1.3...v0.1.4)
…te-config

[pre-commit.ci] pre-commit autoupdate
…uency_code

only fix a reporting_frequency_code when the column exists
…te_fix

update excepted rows for no-fips id-ed respondents but keep annualize…
…at_nan

Fix validation `test_fbp_ferc1_mismatched_fuels` error

codecov bot commented Nov 9, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (bbd82ba) 88.6% compared to head (a2bdffa) 88.7%.
Report is 18 commits behind head on main.

Additional details and impacted files
@@          Coverage Diff          @@
##            main   #3031   +/-   ##
=====================================
  Coverage   88.6%   88.7%           
=====================================
  Files         91      91           
  Lines      10991   11011   +20     
=====================================
+ Hits        9749    9769   +20     
  Misses      1242    1242           
| Files | Coverage Δ |
|---|---|
| src/pudl/analysis/allocate_gen_fuel.py | 91.3% <ø> (ø) |
| src/pudl/analysis/classify_plants_ferc1.py | 92.5% <100.0%> (+<0.1%) ⬆️ |
| src/pudl/extract/eia923.py | 100.0% <100.0%> (ø) |
| src/pudl/extract/excel.py | 96.3% <100.0%> (-0.5%) ⬇️ |
| src/pudl/metadata/classes.py | 86.5% <100.0%> (+<0.1%) ⬆️ |
| src/pudl/metadata/constants.py | 100.0% <ø> (ø) |
| src/pudl/metadata/fields.py | 100.0% <ø> (ø) |
| src/pudl/metadata/resources/eia.py | 100.0% <ø> (ø) |
| src/pudl/metadata/resources/eia923.py | 100.0% <ø> (ø) |
| src/pudl/metadata/sources.py | 100.0% <ø> (ø) |

... and 9 more


@jdangerx jdangerx merged commit a2bdffa into main Nov 9, 2023
13 checks passed
@zaneselvans zaneselvans deleted the nightly-build-2023-11-09 branch December 25, 2023 17:29
Labels
None yet
Projects
Archived in project

7 participants