Merge pull request #1125 from catalyst-cooperative/dev
Potential PUDL v0.4.0 release
zaneselvans authored Aug 14, 2021
2 parents 1083dac + ced574a commit ed2c449
Showing 74 changed files with 6,830 additions and 2,069 deletions.
1 change: 1 addition & 0 deletions .github/dependabot.yml
@@ -10,5 +10,6 @@ updates:

- package-ecosystem: "github-actions"
directory: "/"
target-branch: "dev"
schedule:
interval: "daily"
6 changes: 3 additions & 3 deletions .pre-commit-config.yaml
@@ -2,7 +2,7 @@ repos:

# Quick content checks based on grepping for python specific patterns:
- repo: https://github.com/pre-commit/pygrep-hooks
rev: v1.8.0
rev: v1.9.0
hooks:
- id: python-check-blanket-noqa # Prohibit overly broad QA exclusions.
- id: python-no-eval # Never use eval() it's dangerous.
@@ -35,7 +35,7 @@ repos:

# Make sure import statements are sorted uniformly.
- repo: https://github.com/pre-commit/mirrors-isort
rev: v5.8.0
rev: v5.9.3
hooks:
- id: isort

@@ -59,7 +59,7 @@ repos:

# Check for errors in restructuredtext (.rst) files under the doc hierarchy
- repo: https://github.com/PyCQA/doc8
rev: 0.9.0a1
rev: 0.9.0
hooks:
- id: doc8
args: [--config, tox.ini]
20 changes: 20 additions & 0 deletions CITATION.cff
@@ -0,0 +1,20 @@
cff-version: 1.1.0
message: "If you use PUDL, please cite it as indicated below."
authors:
- family-names: Selvans
given-names: Zane
orcid: https://orcid.org/0000-0002-9961-7208
- family-names: Gosnell
given-names: Christina
- family-names: Winter
given-names: Steven
- family-names: Dunkle Werner
given-names: Karl
orcid: https://orcid.org/0000-0003-0523-7309
- family-names: Shivley
given-names: Greg
orcid: https://orcid.org/0000-0002-8947-694X
title: "The Public Utility Data Liberation (PUDL) Project"
version: 0.3.2
doi: 10.5281/zenodo.3404014
date-released: 2020-02-17
12 changes: 4 additions & 8 deletions README.rst
@@ -20,10 +20,6 @@ The Public Utility Data Liberation Project (PUDL)
:target: https://codecov.io/gh/catalyst-cooperative/pudl
:alt: Codecov Test Coverage

.. image:: https://img.shields.io/codacy/grade/2fead07adef249c08288d0bafae7cbb5
:target: https://app.codacy.com/app/zaneselvans/pudl
:alt: Codacy Grade

.. image:: https://img.shields.io/pypi/v/catalystcoop.pudl
:target: https://pypi.org/project/catalystcoop.pudl/
:alt: PyPI Latest Version
@@ -59,15 +55,15 @@ PUDL currently integrates data from:
* `EIA Form 860 <https://www.eia.gov/electricity/data/eia860/>`__ (2004-2019)
* `EIA Form 860m <https://www.eia.gov/electricity/data/eia860m/>`__ (2020-2021)
* `EIA Form 861 <https://www.eia.gov/electricity/data/eia861/>`__ (2001-2019)
* `EIA Form 923 <https://www.eia.gov/electricity/data/eia923/>`__ (2009-2019)
* `EIA Form 923 <https://www.eia.gov/electricity/data/eia923/>`__ (2001-2019)
* `EPA Continuous Emissions Monitoring System (CEMS) <https://ampd.epa.gov/ampd/>`__ (1995-2020)
* `FERC Form 1 <https://www.ferc.gov/industries-data/electric/general-information/electric-industry-forms/form-1-electric-utility-annual>`__ (1994-2019)
* `FERC Form 714 <https://www.ferc.gov/industries-data/electric/general-information/electric-industry-forms/form-no-714-annual-electric/data>`__ (2006-2019)
* `US Census Demographic Profile 1 Geodatabase <https://www.census.gov/geographies/mapping-files/2010/geo/tiger-data.html>`__ (2010)

Thanks to support from the `Alfred P. Sloan Foundation Energy & Environment Program
<https://sloan.org/programs/research/energy-and-environment>`__, from 2021 to 2023 we will be
integrating the following data as well:
Thanks to support from the `Alfred P. Sloan Foundation Energy & Environment
Program <https://sloan.org/programs/research/energy-and-environment>`__, from
2021 to 2023 we will be integrating the following data as well:

* `EIA Form 176 <https://www.eia.gov/dnav/ng/TblDefs/NG_DataSources.html#s176>`__
(The Annual Report of Natural Gas Supply and Disposition)
51 changes: 51 additions & 0 deletions devtools/databeta.sh
@@ -0,0 +1,51 @@
#!/bin/sh
# A script to compile a Dockerized data release based on a user's local PUDL
# data environment.

# Name of the directory to create the data release archive in
RELEASE_DIR=pudl-v0.4.0-2021-07-15
# The PUDL working directory where we'll find the data to archive:
PUDL_IN=$HOME/code/catalyst/pudl-work
# Reference to an existing Docker image to pull
DOCKER_TAG="2021.03.27"

echo "Started:" `date`
# Start with a clean slate:
rm -rf $RELEASE_DIR
mkdir -p $RELEASE_DIR
# The release container / environment is based on the pudl-examples repo:
git clone --depth 1 git@github.com:catalyst-cooperative/pudl-examples.git $RELEASE_DIR
rm -rf $RELEASE_DIR/.git*
# These directories are where the data will go. They're integrated with the
# Docker container that's defined in the pudl-examples repo:
mkdir -p $RELEASE_DIR/pudl_data
mkdir -p $RELEASE_DIR/user_data

# Freeze the version of the Docker container:
cat $RELEASE_DIR/docker-compose.yml | sed -e "s/pudl-jupyter:latest/pudl-jupyter:$DOCKER_TAG/" > $RELEASE_DIR/new-docker-compose.yml
mv $RELEASE_DIR/new-docker-compose.yml $RELEASE_DIR/docker-compose.yml
# Set up a skeleton PUDL environment in the release dir:
pudl_setup $RELEASE_DIR/pudl_data

# These are probably outdated now... see if they fail.
rm -rf $RELEASE_DIR/pudl_data/environment.yml
rm -rf $RELEASE_DIR/pudl_data/notebook

# Copy over all of the pre-processed data
echo "Copying SQLite databases..."
cp -v $PUDL_IN/sqlite/ferc1.sqlite $RELEASE_DIR/pudl_data/sqlite/
cp -v $PUDL_IN/sqlite/pudl.sqlite $RELEASE_DIR/pudl_data/sqlite/
cp -v $PUDL_IN/sqlite/censusdp1tract.sqlite $RELEASE_DIR/pudl_data/sqlite/
echo "Copying Parquet datasets..."
cp -r $PUDL_IN/parquet/epacems $RELEASE_DIR/pudl_data/parquet/

# Save the Docker image as a tarball so it can be archived with the data:
docker save catalystcoop/pudl-jupyter:$DOCKER_TAG -o $RELEASE_DIR/pudl-jupyter.tar

# List the high-level contents of the archive so we can see what it contains:
find $RELEASE_DIR -maxdepth 3

# Create the archive
tar -czf $RELEASE_DIR.tgz $RELEASE_DIR

echo "Finished:" `date`
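
The script above freezes the Docker image version by rewriting `pudl-jupyter:latest` to `pudl-jupyter:$DOCKER_TAG` in `docker-compose.yml`. As a minimal sketch of that one step, the following Python performs the same substitution on a string. The function name `pin_image_tag` and the compose fragment are hypothetical and only for illustration; the real file lives in the pudl-examples repo. Unlike `sed` without the `g` flag, `str.replace` rewrites every occurrence, not just the first one on each line.

```python
def pin_image_tag(compose_text: str, image: str, tag: str) -> str:
    """Return compose_text with every `<image>:latest` replaced by `<image>:<tag>`."""
    return compose_text.replace(f"{image}:latest", f"{image}:{tag}")


# Hypothetical compose fragment, NOT the actual pudl-examples file.
example = (
    "services:\n"
    "  pudl-jupyter:\n"
    "    image: catalystcoop/pudl-jupyter:latest\n"
)

# The tag value matches DOCKER_TAG in the script above.
pinned = pin_image_tag(example, "pudl-jupyter", "2021.03.27")
print(pinned)
```

The service key `pudl-jupyter:` is left untouched because it is not followed by `latest`; only the image reference changes.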
@@ -62,9 +62,9 @@
"outputs": [],
"source": [
"eia923_tables = pc.pudl_tables['eia923']\n",
"eia923_years = [2018, 2019]\n",
"eia923_years = list(range(2001, 2020))\n",
"eia860_tables = pc.pudl_tables['eia860']\n",
"eia860_years = [2018, 2019]"
"eia860_years = list(range(2004, 2020))"
]
},
{
@@ -83,6 +83,13 @@
"ds = pudl.workspace.datastore.Datastore(local_cache_path=Path(pudl_settings[\"data_dir\"]))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# EIA-860"
]
},
{
"cell_type": "markdown",
"metadata": {},
@@ -105,7 +112,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## Extract just the EIA-923"
"## Transform just the EIA-860"
]
},
{
@@ -115,15 +122,22 @@
"outputs": [],
"source": [
"%%time\n",
"eia923_extractor = pudl.extract.eia923.Extractor(ds)\n",
"eia923_raw_dfs = eia923_extractor.extract(year=eia923_years)"
"eia860_transformed_dfs = pudl.transform.eia860.transform(\n",
" eia860_raw_dfs, eia860_tables=eia860_tables)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Transform just the EIA-860"
"# EIA-923"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Extract just the EIA-923"
]
},
{
@@ -133,8 +147,8 @@
"outputs": [],
"source": [
"%%time\n",
"eia860_transformed_dfs = pudl.transform.eia860.transform(\n",
" eia860_raw_dfs, eia860_tables=eia860_tables)"
"eia923_extractor = pudl.extract.eia923.Extractor(ds)\n",
"eia923_raw_dfs = eia923_extractor.extract(year=eia923_years)"
]
},
{
@@ -155,6 +169,13 @@
" eia923_raw_dfs, eia923_tables=eia923_tables)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Combined EIA Data"
]
},
{
"cell_type": "markdown",
"metadata": {},
@@ -223,7 +244,7 @@
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
@@ -237,7 +258,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.4"
"version": "3.9.6"
}
},
"nbformat": 4,
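
The notebook diff above widens the year settings from `[2018, 2019]` to `list(range(...))` calls. Because Python's `range` excludes its stop value, a stop of 2020 yields years through 2019, which lines up with the coverage listed in the README diff (EIA-923: 2001-2019, EIA-860: 2004-2019). A small check, using only the expressions from the notebook:

```python
# Year lists exactly as written in the updated notebook cell.
eia923_years = list(range(2001, 2020))
eia860_years = list(range(2004, 2020))

# range() is half-open: the stop value (2020) is not included.
print(eia923_years[0], eia923_years[-1], len(eia923_years))
print(eia860_years[0], eia860_years[-1], len(eia860_years))
```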
7 changes: 0 additions & 7 deletions docs/api/pudl.output.glue.rst

This file was deleted.

1 change: 0 additions & 1 deletion docs/api/pudl.output.rst
@@ -13,7 +13,6 @@ Submodules
pudl.output.epacems
pudl.output.ferc1
pudl.output.ferc714
pudl.output.glue
pudl.output.pudltabl

Module contents
8 changes: 7 additions & 1 deletion docs/conf.py
@@ -22,7 +22,7 @@
# -- Project information -----------------------------------------------------

project = 'PUDL'
copyright = '2016-2021, Catalyst Cooperative' # noqa: A001
copyright = '2016-2021, Catalyst Cooperative, CC-BY-4.0' # noqa: A001
author = 'Catalyst Cooperative'

# -- General configuration ---------------------------------------------------
@@ -38,9 +38,15 @@
'sphinx.ext.todo',
'sphinx.ext.viewcode',
'sphinx_issues',
'sphinx_reredirects',
]
todo_include_todos = True

# Redirects to keep folks from hitting 404 errors:
redirects = {
"data_dictionary": "data_dictionaries/pudl_db.html",
}

# GitHub repo
issues_github_path = "catalyst-cooperative/pudl"

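
The `redirects` mapping added to `docs/conf.py` above points the old `data_dictionary` page at `data_dictionaries/pudl_db.html`. The sketch below illustrates where a browser would end up, assuming the relative target is resolved against the old page's URL in the usual way. The base URL `https://example.org/docs/` is a placeholder and not the project's real documentation host, and `resolve_redirect` is a hypothetical helper rather than part of `sphinx_reredirects`.

```python
from urllib.parse import urljoin

# Mapping copied from the conf.py diff above.
redirects = {
    "data_dictionary": "data_dictionaries/pudl_db.html",
}

# Placeholder documentation root (an assumption for illustration only).
BASE_URL = "https://example.org/docs/"


def resolve_redirect(docname: str) -> str:
    """Resolve the redirect target relative to the old page's URL."""
    old_page = urljoin(BASE_URL, f"{docname}.html")
    return urljoin(old_page, redirects[docname])


print(resolve_redirect("data_dictionary"))
```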
File renamed without changes.
@@ -5,12 +5,6 @@ FERC Form 1 Data Dictionary
We have mapped the Visual FoxPro DBF files to their corresponding FERC Form 1
database tables and provided a short description of the contents of each table here.

* :download:`A diagram of the 2015 FERC Form 1 Database (PDF)
<ferc1/ferc1_db_diagram_2015.pdf>`
* :download:`Blank FERC Form 1 (PDF, to 2014-12-31) <ferc1/ferc1_blank_2014-12-31.pdf>`
* :download:`Blank FERC Form 1 (PDF, to 2019-12-31) <ferc1/ferc1_blank_2019-12-31.pdf>`
* :download:`Blank FERC Form 1 (PDF, to 2022-11-30) <ferc1/ferc1_blank_2022-11-30.pdf>`

.. note::

* The Table Names link to the contents of the database table on our `FERC Form 1
@@ -24,6 +18,6 @@ database tables and provided a short description of the contents of each table h
Quarterly. A/Q if the data is reported both annually and quarterly.

.. csv-table::
:file: ferc1/ferc1_db_notes.csv
:file: ferc1_db.csv
:header-rows: 1
:widths: auto
16 changes: 16 additions & 0 deletions docs/data_dictionaries/index.rst
@@ -0,0 +1,16 @@
.. _data-dictionaries:

Data Dictionaries
=================

.. toctree::
:caption: Data Processed & Cleaned by PUDL
:maxdepth: 1

pudl_db

.. toctree::
:caption: Raw, Unprocessed Data
:maxdepth: 1

ferc1_db