Data analysis on Chicago infrastructure and infrastructure spending

simplicio10/data-analysis

 
 

Chicago Participatory Urbanism

Every year, each Chicago alder gets $1.5 million to spend at their discretion on capital improvements in their ward through the city's Aldermanic Menu Program. Their spending is publicly available in PDF format in the Chicago Capital Improvement Archive. We are in the process of extracting and geocoding this data.

Check out the GitHub issues for things to work on.

Getting Started/Installing the Repo

  1. Clone the repo.
  2. Run the following command in the terminal:
pip install .

Note: When doing development work on the package, you need to re-run this command to use the latest package changes in external scripts.

The repo has two main parts: the data processing Python package and a library of scripts that use the package. If you're a newcomer, we recommend familiarizing yourself with the project by using the scripts to follow the data processing workflow outlined below.

Data Processing Workflow

Using the repo scripts, the data processing involves the following steps:

  • Extract data from PDFs
  • Post-process data (name cleanup, field separation, categorization)
  • Geocode location data
    • Identify location format
    • Parse location into collection of street numbers or street intersections
    • Get GPS coordinates from street numbers and street intersections
    • Combine coordinates into point(s), lines, or polygons
  • Post-process geo-data
    • Interpolate lines and polygons into point clouds for heatmapping
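
As an illustration of the last step above, here is a minimal sketch of turning a line segment into evenly spaced points for heatmapping. It is not the package's actual implementation: the function name `interpolate_line` and the `(longitude, latitude)` tuple convention are assumptions made for this example. It uses plain linear interpolation, which is only a reasonable approximation over short distances such as city blocks.

```python
from typing import List, Tuple

# Assumed convention for this sketch: (longitude, latitude) pairs.
Point = Tuple[float, float]


def interpolate_line(start: Point, end: Point, n_points: int) -> List[Point]:
    """Return n_points evenly spaced points from start to end, inclusive.

    Hypothetical helper, not part of the repo's package.
    """
    if n_points < 2:
        raise ValueError("n_points must be at least 2")
    (x0, y0), (x1, y1) = start, end
    step = n_points - 1
    return [
        (x0 + (x1 - x0) * i / step, y0 + (y1 - y0) * i / step)
        for i in range(n_points)
    ]


if __name__ == "__main__":
    # Three points along a short segment, including both endpoints.
    for point in interpolate_line((0.0, 0.0), (1.0, 2.0), 3):
        print(point)
```

A polygon could be handled the same way by interpolating along each edge of its boundary, or by sampling its interior, depending on how the heatmap should weight area-based projects.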

Code Overview

Scripts

  • ward_spending_pdf_data_extraction - converts CIP aldermanic menu spending PDFs into CSVs
  • ward_spending_post_processing - post-processes PDF data, making fixes to columns and categorizing items
  • ward_spending_geocoding - geocodes the CSV data, outputting GeoJSON
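
To give a feel for the kind of output the geocoding script produces, the sketch below builds GeoJSON features from a list of coordinates using only the standard library. The function names (`make_feature`, `make_feature_collection`), the `closed` flag, and the property keys are hypothetical and chosen for this example; the actual script may structure its output differently.

```python
import json
from typing import Dict, Iterable, List, Sequence


def make_feature(coords: Sequence[Sequence[float]], properties: Dict,
                 closed: bool = False) -> Dict:
    """Build a GeoJSON Feature from [longitude, latitude] positions.

    Hypothetical helper: one position becomes a Point, several become a
    LineString, and three or more with closed=True become a Polygon.
    """
    positions: List[List[float]] = [list(c) for c in coords]
    if not positions:
        raise ValueError("at least one coordinate is required")
    if len(positions) == 1:
        geometry = {"type": "Point", "coordinates": positions[0]}
    elif closed and len(positions) >= 3:
        # GeoJSON polygon rings must end on the position they start on.
        ring = positions + [positions[0]]
        geometry = {"type": "Polygon", "coordinates": [ring]}
    else:
        geometry = {"type": "LineString", "coordinates": positions}
    return {"type": "Feature", "geometry": geometry, "properties": properties}


def make_feature_collection(features: Iterable[Dict]) -> Dict:
    """Wrap features in a GeoJSON FeatureCollection."""
    return {"type": "FeatureCollection", "features": list(features)}


if __name__ == "__main__":
    # Illustrative values only, not real spending data.
    feature = make_feature([(-87.65, 41.90), (-87.66, 41.90)],
                           {"item": "example street resurfacing"})
    print(json.dumps(make_feature_collection([feature]), indent=2))
```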

Upcoming Bike Lanes

  • bike_geocoding_script - one-off, uses the ward wise libraries to geocode CDOT upcoming bike lane data

Chicago Participatory Urbanism libraries

  • ward_spending.address_geocoding - use to convert location text into geocoded geometry data
    • ward_spending.address_format_processing - use to parse location text into street numbers and street intersections
    • geocoder - use to geocode street numbers and street intersections
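
The sketch below shows one way the "parse location text" idea could work. It does not use or mirror the API of `ward_spending.address_format_processing`. The function name `parse_location`, the two regular expressions, and the example location strings are all assumptions for illustration; the formats that actually appear in the PDFs may differ and are likely more varied.

```python
import re
from typing import Dict

# Assumed formats for this sketch:
#   "N MILWAUKEE AVE & W DIVISION ST"  -> street intersection
#   "1200 N STATE ST"                  -> street number plus street name
INTERSECTION_RE = re.compile(r"^\s*(.+?)\s*&\s*(.+?)\s*$")
ADDRESS_RE = re.compile(r"^\s*(\d+)\s+(.+?)\s*$")


def parse_location(text: str) -> Dict:
    """Classify a location string and split it into its parts.

    Hypothetical helper, not the package's real parser.
    """
    match = INTERSECTION_RE.match(text)
    if match:
        return {"format": "intersection",
                "streets": [match.group(1), match.group(2)]}
    match = ADDRESS_RE.match(text)
    if match:
        return {"format": "address",
                "number": int(match.group(1)),
                "street": match.group(2)}
    # Anything unrecognized is passed through for manual review.
    return {"format": "unknown", "text": text.strip()}


if __name__ == "__main__":
    for sample in ["N MILWAUKEE AVE & W DIVISION ST", "1200 N STATE ST"]:
        print(parse_location(sample))
```

Checking for an intersection before a street address keeps a string such as "1200 N STATE ST & W DIVISION ST" from being treated as a single address; whether that is the right priority depends on the real data.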

Ideas

Brainstorming Doc
