v0.9.2

@WesIngwersen WesIngwersen released this 28 May 15:00
· 783 commits to master since this release

standardizedinventories-master folder

  1. setup.py and the requirements file were modified to add BeautifulSoup, argparse, selenium, regex, PyYAML, and webdriver-manager to the installation requirements.

stewi folder

  1. TRI.py was modified
  2. The following packages were included:
    a. argparse: to pass inputs directly from Windows CMD
    b. beautifulsoup: to retrieve information directly from static HTML
    c. regex: to handle regular expressions
    d. io: to deal with various types of input/output
  3. The files TRI_File_1a_columns.txt and TRI_File_3a_columns.txt were added to the stewi/data folder. They list all the column names for TRI Data Plus File 1a and File 3a. In 2018, the TRI program administrators split File 1 into two new files, File 1a and File 1b; File 1a is the one StEWI needs. I used TRI File 3a because it has information about off-site transfers. I noticed that StEWI did not include information about the basis of estimate for off-site land treatment and other off-site land disposals.
  4. TRI_required_fields.txt was modified to add new information
  5. TRI_keys.txt was added to the stewi/data folder to make it easier to add new TRI keys.
  6. Files named TRI_chem_release_Year.csv were added as a new validation source.
  7. A new file, config.yaml, was added. It makes it easier to adapt when a website's location changes and/or its source code is modified. At the moment, it only holds information for RCRA, TRI, FRS, and SRS.
  8. To use TRI.py, navigate to standardizedinventories-master in Windows CMD (or to stewi, in which case drop the stewi/ prefix) and run:

python stewi/TRI.py Option Year -F File1a File2 … FileN

The options are A, B, and C. A extracts files from the TRI Data Plus website. B organizes the TRI National Totals files from TRI_chem_release_Year.csv (this file must be downloaded beforehand and stored as described in TRI.py). C organizes the TRI data as required by StEWI.

For instance, to retrieve File 1a and File 3a of TRI Data Plus (as in our case) for TRI 2018, run in Windows CMD:

python stewi/TRI.py A 2018 -F 1a 3a

Then, to create TRI_2018_NationalTotals.csv for validation:

python stewi/TRI.py B 2018

The -F flag and the file list are not needed here, but TRI_chem_release_2018.csv must be in the data folder.

Finally, to organize the data for StEWI:

python stewi/TRI.py C 2018 -F 1a 3a
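
The three invocations above suggest a positional option, a positional year, and an optional -F list. As a rough illustration only (this is not the code in TRI.py; `build_parser` and the attribute names `option`, `year`, and `files` are assumptions made for this sketch), such an interface could be declared with argparse like this:

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the documented TRI.py call pattern."""
    parser = argparse.ArgumentParser(description="Sketch of a TRI.py-style CLI")
    # Positional option: A (extract), B (National Totals), C (organize for StEWI).
    parser.add_argument("option", choices=["A", "B", "C"])
    # Reporting year, e.g. 2018.
    parser.add_argument("year", type=int)
    # Optional list of TRI Data Plus files, e.g. 1a 3a; may be omitted (as for B).
    parser.add_argument("-F", dest="files", nargs="+", default=[])
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(["A", "2018", "-F", "1a", "3a"])
print(args.option, args.year, args.files)
```

Using `nargs="+"` with an empty default lets option B omit -F entirely while A and C accept any number of file identifiers.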

Note: TRI may add new columns over time. When that happens, you only need to add them to TRI_File_1a_columns.txt and TRI_File_3a_columns.txt (or to the column files for any other TRI files you use).
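
To illustrate why editing the column files is enough, here is a minimal sketch. It assumes one column name per line in the column file and a tab-delimited, headerless data file; neither format is confirmed here, and the column names, sample record, and helper names are hypothetical.

```python
import csv
import io


def read_column_names(lines):
    """Return non-empty, stripped column names, one per input line."""
    return [line.strip() for line in lines if line.strip()]


def label_rows(data_lines, column_names, delimiter="\t"):
    """Pair each delimited record with the column names, yielding dicts."""
    for row in csv.reader(data_lines, delimiter=delimiter):
        yield dict(zip(column_names, row))


# In-memory stand-ins for a column file and a headerless data file (made-up content).
columns_file = io.StringIO("FACILITY_ID\nCHEMICAL\nNEW_COLUMN\n")
data_file = io.StringIO("001\tBenzene\t42\n")

names = read_column_names(columns_file)
records = list(label_rows(data_file, names))
print(records)
```

Because the names come from the text file, a newly published column only requires a new line in that file, not a change to the parsing code.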

  1. RCRAInfo.py was modified
  2. The Selenium package was used to handle dynamic HTML, because requests and urllib3 cannot handle it.
  3. To use RCRAInfo.py, navigate to standardizedinventories-master in Windows CMD (or to stewi, in which case drop the stewi/ prefix) and run:

python stewi/RCRAInfo.py Option Year -T Table1 Table2 … TableN

The options are E, O, and C. E extracts files from the RCRAInfo website. O organizes the Biennial Report by year, because the current flat file mixes information from all years. C creates the files StEWI needs.

For instance, to retrieve the table BR_REPORTING, run in Windows CMD:

python stewi/RCRAInfo.py E -T BR_REPORTING

Then, to organize the BR_REPORTING table by year (this does not take data for existing RCRAInfo reports):

python stewi/RCRAInfo.py O -T BR_REPORTING
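
Conceptually, option O separates a mixed flat file into one group per reporting year. A minimal sketch of that idea follows; it assumes comma-separated input with a header row and a hypothetical REPORT_YEAR field, which may not match the real RCRAInfo flat file layout or the logic in RCRAInfo.py.

```python
import csv
import io
from collections import defaultdict


def split_by_year(flat_file, year_field="REPORT_YEAR"):
    """Group the rows of a mixed-year file by the value of year_field."""
    by_year = defaultdict(list)
    for row in csv.DictReader(flat_file):
        by_year[row[year_field]].append(row)
    return dict(by_year)


# Made-up sample standing in for a flat file that mixes several years.
sample = io.StringIO(
    "HANDLER_ID,REPORT_YEAR,TONS\n"
    "X1,2015,3\n"
    "X2,2017,5\n"
    "X3,2017,1\n"
)

groups = split_by_year(sample)
for year, rows in sorted(groups.items()):
    print(year, len(rows))
```

Each group could then be written to its own per-year file, which is the form the later C step would consume for a single year such as 2017.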

Finally, to organize the 2017 Biennial Report for StEWI:

python stewi/RCRAInfo.py C 2017

The -T flag and the table names are not needed here, but RCRAInfo_2017_NationalTotals.csv must be in the data folder.

  1. The RCRAInfo National Totals were obtained from https://rcrapublic.epa.gov/rcrainfoweb/action/modules/br/trends/view
  2. RCRA_FlatFile_LineComponents_2019.csv was added because of changes in the flat file specification: https://rcrainfo.epa.gov/rcrainfo-help/application/publicHelp/index.htm
  3. ValidationSets_Sources.csv was modified to include the sources for the TRI National Totals (years between 2001 and 2017) and the RCRAInfo National Totals (2001 to 2009, and 2017).

chemicalmatcher folder

  1. Some modifications in globals.py, programsynonymlookupbyCAS.py, and writeStEWIchemicalmatchesbyinventory.py were made.

facilitymatcher folder

  1. Some alterations in globals.py, WriteFacilityMatchesforStEWI.py, and WriteFRSNAICSforStEWI.py were made.