Integrate EIA860 2023 Early Release Data #3676

Closed
65 tasks done
aesharpe opened this issue Jun 17, 2024 · 0 comments · Fixed by #3681
Assignees
Labels
data-update When fresh data is integrated into PUDL from quarterly or annual updates eia860 Anything having to do with EIA Form 860

Comments

@aesharpe
Member

aesharpe commented Jun 17, 2024

Annual Updates Docs: https://catalystcoop-pudl.readthedocs.io/en/latest/dev/existing_data_updates.html

Tasks

  1. 6 of 6
    data-update eia860
    e-belfer

Extraction Notes

  • There are some new solar and storage columns. I tried to name them but would love input:
    • Most of the boolean columns in the storage tabs use the prefix served_ because EIA's descriptions indicate that each one is a bool of whether or not a unit served x/y/z thing. But these new columns have descriptions that read more like "indicates whether the storage device is..."
    • Support other units.
  • There is a new proposed storage tab!

Transform Notes

  • There were a small handful of null generator ids. In the ownership table transform, there were two different versions of dealing with the same plant with null generator ids. I condensed these.
  • Question: Are the operational statuses of these solar/storage plants ending up anywhere (presumably in the annual harvested table)? Do we need to add them based on their tab? Answer: There are status columns in these original tabs, and they are getting properly pulled into the harvested annual assets.
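
As a rough illustration of the null generator ID cleanup described above, here is a minimal pandas sketch. It is not the actual PUDL transform: the function name `drop_null_generator_ids` and the toy ownership data are hypothetical, and the column names are assumed for the example.

```python
import pandas as pd


def drop_null_generator_ids(df: pd.DataFrame) -> pd.DataFrame:
    """Drop rows whose generator_id is missing, blank, or the string "nan".

    Hypothetical helper for illustration only.
    """
    gen_id = df["generator_id"]
    # Catch real nulls as well as string placeholders such as "nan" or "".
    as_text = gen_id.astype(str).str.strip().str.lower()
    is_null = gen_id.isna() | as_text.isin(["nan", ""])
    return df.loc[~is_null].reset_index(drop=True)


# Toy ownership records (made-up values).
ownership = pd.DataFrame(
    {
        "plant_id_eia": [1, 1, 2, 3],
        "generator_id": ["G1", None, "nan", "G2"],
    }
)
cleaned = drop_null_generator_ids(ownership)
print(cleaned)
```

Handling the real null and the "nan" string in one place is the point of the sketch: a single rule avoids two divergent treatments of the same plant.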

PUDL ID Mapping Notes

  • I did a small amount of notebook sleuthing to find a better, quicker, less laggy way to map the plant IDs. I will probably check this in as a little devtools notebook.
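
The notebook itself is not shown in this issue, so the following is only a guess at the kind of check involved: listing plant IDs that appear in the new data but are not yet in the ID map. The function `unmapped_plants`, the column names, and the sample frames are all assumptions made for illustration.

```python
import pandas as pd


def unmapped_plants(plants: pd.DataFrame, id_map: pd.DataFrame) -> pd.DataFrame:
    """Return one row per plant_id_eia that has no entry in id_map.

    Hypothetical helper; not the notebook referenced in the note above.
    """
    mapped = set(id_map["plant_id_eia"])
    missing = plants.loc[~plants["plant_id_eia"].isin(mapped)]
    return (
        missing.drop_duplicates("plant_id_eia")
        .sort_values("plant_id_eia")
        .reset_index(drop=True)
    )


# Made-up sample data: plant 10 is already mapped, 11 and 12 are not.
plants = pd.DataFrame(
    {
        "plant_id_eia": [10, 11, 12, 12],
        "plant_name_eia": ["Alpha", "Beta", "Gamma", "Gamma"],
    }
)
id_map = pd.DataFrame({"plant_id_eia": [10], "plant_id_pudl": [1]})

to_map = unmapped_plants(plants, id_map)
print(to_map)
```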

Tests & Validation Notes

  • Why are there some negative values here? My assumption is that with this new 860 ER we are replacing the last 860m from 2023. So we had a chunk of 860m generators, some of which may no longer be in the ER version. This makes sense so far because it doesn't seem to happen to the 860 tables that are not included in the 860m data. Will confirm by using the SQL differ notebook.
    • Okay, yes, I have confirmed my suspicion by investigating the differences between the old and new _out_eia__yearly_generators. All but one of the newly missing generators (the exception is the one with the "nan" generator_id that is now being removed in the transform) are missing from 2023. Weirdly, most of them are retired. I spot-checked the retired and other tabs in the ER sheets, and it seems like these missing generators are truly missing.
[Image attached in the original issue; not reproduced here]
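
The old-versus-new comparison described above can be sketched as an anti-join in pandas. This is not the SQL differ notebook mentioned in the note. The function `newly_missing`, the key columns, and the toy data are assumptions, standing in for two versions of a table like _out_eia__yearly_generators.

```python
import pandas as pd

# Assumed primary-key columns for a yearly generators table.
KEYS = ["plant_id_eia", "generator_id", "report_date"]


def newly_missing(old: pd.DataFrame, new: pd.DataFrame) -> pd.DataFrame:
    """Return rows of `old` whose key does not appear in `new` (an anti-join)."""
    merged = old.merge(
        new[KEYS].drop_duplicates(), on=KEYS, how="left", indicator=True
    )
    only_old = merged.loc[merged["_merge"] == "left_only"]
    return only_old.drop(columns="_merge").reset_index(drop=True)


# Toy data: generator B was in the old table but is absent from the new one.
old = pd.DataFrame(
    {
        "plant_id_eia": [1, 1, 2],
        "generator_id": ["A", "B", "C"],
        "report_date": ["2023-01-01"] * 3,
        "operational_status": ["existing", "retired", "existing"],
    }
)
new = pd.DataFrame(
    {
        "plant_id_eia": [1, 2],
        "generator_id": ["A", "C"],
        "report_date": ["2023-01-01"] * 2,
        "operational_status": ["existing", "existing"],
    }
)

missing = newly_missing(old, new)
# Summarize which years and statuses the dropped generators fall under.
print(missing.groupby(["report_date", "operational_status"]).size())
```

Grouping the missing rows by report date and status is one way to check whether the drops are concentrated in 2023 and among retired units.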

Validation errors to fix

@aesharpe aesharpe converted this from a draft issue Jun 17, 2024
@aesharpe aesharpe added eia860 Anything having to do with EIA Form 860 data-update When fresh data is integrated into PUDL from quarterly or annual updates labels Jun 17, 2024
@cmgosnell cmgosnell self-assigned this Jun 17, 2024
@cmgosnell cmgosnell moved this from Backlog to In progress in Catalyst Megaproject Jun 18, 2024
@github-project-automation github-project-automation bot moved this from In progress to Done in Catalyst Megaproject Jun 21, 2024
2 participants