This is an ETL process for extracting and publishing data from the City of Philadelphia 311 system.
- `git clone` this repo.
- Create a virtualenv, activate it, and `pip install -r requirements.txt`.
- Rename `sample_config.py` to `config.py` and enter actual values (or download from Lastpass).
- Create a batch file to activate the virtualenv and run `python sync.py`. Schedule this to run regularly.
`seed.py` is used to truncate the cases table and reload it from a CSV dump. The basic usage is:
python seed.py <file>
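
The truncate-and-reload pattern can be sketched roughly as follows. This is not the code in `seed.py`: it uses an in-memory SQLite database as a stand-in for the real one, and the `seed` function, the two-column schema, and the sample rows are all hypothetical.

```python
import csv
import io
import sqlite3


def seed(conn, csv_file, table="cases"):
    """Empty `table`, then reload it from the rows in `csv_file`."""
    reader = csv.reader(csv_file)
    header = next(reader)  # first CSV row holds the column names
    columns = ", ".join(header)
    placeholders = ", ".join("?" for _ in header)
    with conn:  # one transaction: commit on success, roll back on error
        conn.execute(f"DELETE FROM {table}")  # SQLite has no TRUNCATE
        conn.executemany(
            f"INSERT INTO {table} ({columns}) VALUES ({placeholders})", reader
        )
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]


# Hypothetical demo data; the real cases table has its own schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cases (case_id TEXT, updated_datetime TEXT)")
conn.execute("INSERT INTO cases VALUES ('old', '2016-01-01T00:00:00Z')")
dump = io.StringIO(
    "case_id,updated_datetime\n"
    "A1,2016-05-18T09:00:00Z\n"
    "A2,2016-05-18T10:30:00Z\n"
)
row_count = seed(conn, dump)
```

Running the delete and the inserts in one transaction means a failed load leaves the previous contents in place rather than an empty table.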
`sync.py` will check the database table for the most recent `updated_datetime` and get all records from Salesforce that have been updated since then. For a description of command-line arguments, see `python sync.py --help`.
The basic usage is:
python sync.py
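
The incremental logic described above (find the newest `updated_datetime`, then ask Salesforce only for later changes) might look something like this sketch. It is an illustration, not the contents of `sync.py`: the helper names are hypothetical, SQLite stands in for the real database, and the `Case` object and `LastModifiedDate` field are assumptions about what is being queried.

```python
import sqlite3


def latest_update(conn, table="cases"):
    """Return the most recent updated_datetime in the table, or None if empty."""
    return conn.execute(f"SELECT MAX(updated_datetime) FROM {table}").fetchone()[0]


def build_query(since, object_name="Case", field="LastModifiedDate"):
    """Build a SOQL-style query for records modified after `since`."""
    query = f"SELECT Id FROM {object_name}"
    if since is not None:
        # SOQL datetime literals are written without quotes.
        query += f" WHERE {field} > {since}"
    return query


# Hypothetical demo rows; ISO 8601 strings sort correctly as text.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cases (case_id TEXT, updated_datetime TEXT)")
conn.executemany(
    "INSERT INTO cases VALUES (?, ?)",
    [("A1", "2016-05-17T08:00:00Z"), ("A2", "2016-05-18T10:30:00Z")],
)
since = latest_update(conn)
query = build_query(since)
print(query)
```

With an empty table `latest_update` returns `None`, and the sketch falls back to an unfiltered query.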
If the Salesforce query times out, you may have to chunk the updates into individual days. To sync just a single day, use the `-d` option:
python sync.py -d 2016-05-18
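
When several days need to be caught up, the per-day commands can be generated rather than typed by hand. The following is a minimal sketch; `daily_sync_commands` is a hypothetical helper that is not part of this repo, and the date range is only an example.

```python
from datetime import date, timedelta


def daily_sync_commands(start, end):
    """Return one `python sync.py -d YYYY-MM-DD` command per day, inclusive."""
    days = (end - start).days + 1
    return [
        ["python", "sync.py", "-d", (start + timedelta(days=i)).isoformat()]
        for i in range(days)
    ]


commands = daily_sync_commands(date(2016, 5, 16), date(2016, 5, 18))
for command in commands:
    print(" ".join(command))
```

Each command is an argument list, so it could be passed to `subprocess.run(command, check=True)` to run the days one after another and stop at the first failure.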