-
-
Notifications
You must be signed in to change notification settings - Fork 110
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
PHMSA transmission part P #3279
Conversation
7b0cf8d
to
8541273
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One question that I need a bit of help with.
other_materials,,,,,,,,,,,,,,,,,,,,,partpothermaterials,partpothermaterials,partpothermaterials,partpothermaterials,partpothermaterials,partpothermaterials,partpothermaterials,partpothermaterials,partpothermaterials,partpothermaterials,partpothermaterials,partpothermaterials,partpothermaterials | ||
gathering_other_materials,,,,,,,,,,,,,,,,,,,,,partpgothermaterials,partpgothermaterials,partpgothermaterials,partpgothermaterials,partpgothermaterials,partpgothermaterials,partpgothermaterials,partpgothermaterials,partpgothermaterials,partpgothermaterials,partpgothermaterials,partpgothermaterials,partpgothermaterials |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So for most of the field names, we have part<letter><t or g or nothing><name>
- so I inferred that these were the "total other materials" and "gathering other materials" columns... but I think it sort of makes sense for these to be the "transmission other materials" & "gathering other materials" columns - to show what those other materials might be. I guess partpgothermaterials
is pretty unambiguously "gathering", so my question is - do we think we should call this total_other_materials
or transmission_other_materials
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ugh that's tough. I think I would go with total_other_materials
because it is not specified.... even though with more context we might learn that this is transmission not total.
It looks like these two are outside of the core table in the PDF version of the table. The description is "Composite pipe requires a PHMSA Special Permit or waiver from a State2 specify Other material(s):"
I would definitely add something about composite
and special_permit
or something to the names of both of these.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I read the PDF as saying two things:
- Composite pipe requires a special permit / waiver
- please specify "other materials" - we'll store this as
partpothermaterials
andpartpgothermaterials
In which case we shouldn't have any composite
/special_permit
in the name.
Part of why I read it that way is that the rest of the form distinguishes between composite
& other
materials, and another is that there's a little superscript (1) by the Composite column header which seems like it would correspond to the first point.
What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a heads up that when there's a "specify" field I've been calling it "x_y_description" as a column name. (E.g., "gathering_other_materials_description"). But I think Dazhong's reading of the form makes sense to me. When you look at the actual entries in the raw data what kinds of info are you seeing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh yes you are totally right here. good eye. no composite
or special_permit
!
hm okay well that changes my answer to you were originally asking because now we can check if partpototal
shows up when there is any other pipe or just transmission and it looks like there are only values in this partpototal
when there is other transmission miles! so i would go with transmission_other_materials
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sweet, thanks for all the discussion! Going with transmission/gathering_other_materials_description
:)
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #3279 +/- ##
=====================================
Coverage 92.7% 92.7%
=====================================
Files 145 145
Lines 13100 13100
=====================================
Hits 12143 12143
Misses 957 957 ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
one request to add _miles
suffixes and obvis we should settle on the name of the weird not in the table two you brought up but otherwise this looks grand
oh plus i think you need to add the pr into the release notes
src/pudl/package_data/phmsagas/column_maps/yearly_miles_of_pipe_in_state_by_material.csv
Outdated
Show resolved
Hide resolved
src/pudl/package_data/phmsagas/column_maps/yearly_miles_of_pipe_in_state_by_material.csv
Show resolved
Hide resolved
8541273
to
56e709f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🚀
172f234
to
aa31bcb
Compare
aa31bcb
to
41b632d
Compare
Overview
Closes #3278.
This maps part P of the PHMSA transmission/gathering pipeline data.
It has state-by-state information about pipeline materials, from 2010 onwards.
I added a column map for
yearly_miles_of_pipe_in_state_by_material
and added that to the various "where to find this data" CSVs.I also made the release notes have a nested list of the small PRs so that we don't have a merge conflict on that same line over and over.
Testing
How did you make sure this worked? How can a reviewer verify this?
I ran the debugging notebook and also materialized the PHMSA assets.
To-do list