Documentation for the data processing pipeline in the Hurricane Harvey project.
Folder Arrangement (Hyak):
- DSSG (I assume this was run locally)
  a. crop_post_event.py
  b. run-all.py
  c. test.py
  d. loopThroughDir.py
  e. script.py
  f. tomnod.py
- dds
  a. Crop.py
  b. Crop_image.py
  c. loopThroughDir.py
  e. anaconda3
  f. core.3459
  g. crop_pre
  h. damage_data.csv
  i. non_damage_data.csv
  j. postevent
  k. preevent
  l. readme
  m. run.slurm
  n. run_all.sh
  o. slurm-167806.out
  p. tifRange.csv
  q. tifRange1.csv
- img
  a. crop_img.py
Folder Arrangement (GitHub Repo):
- cropping
- data
- downloading
  a. parse.py
  b. getCatalog.py
  c. script.py
  d. run-all.py
- notebooks
- referencing
  a. loop_through_dir.py
  b. tomnod.py
- band_sorting
  a. sort_images.py
Process flow:
parse.py -> getCatalog.py -> script.py -> run-all.py (downloading), then loop_through_dir.py -> tomnod.py (referencing), then crop_post_event.py (cropping), with sort_images.py handling band sorting.
Script Documentation
- parse.py
  This script pulls HTML links from DigitalGlobe's Open Data Program page for Hurricane Harvey. To download another event's data, open the script and change the link.
  Input: HTML link (hard-coded in the script)
  Output: txt files containing HTML links (each file corresponds to one Catalog ID)
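
  A minimal sketch of the parse step, assuming the page lists .tif links and that each link's parent path segment is its Catalog ID; the URL and href pattern below are illustrative guesses, not the original code:

    import re
    import urllib.request
    from collections import defaultdict

    # Hypothetical event page; the real script hard-codes the link it scrapes.
    PAGE_URL = "https://www.digitalglobe.com/opendata/hurricane-harvey"

    html = urllib.request.urlopen(PAGE_URL).read().decode("utf-8")
    links = re.findall(r'href="([^"]+\.tif)"', html)  # pull every .tif link off the page

    # Group links by Catalog ID, assumed here to be the parent path segment.
    by_catalog = defaultdict(list)
    for link in links:
        parts = link.rstrip("/").split("/")
        if len(parts) >= 2:
            by_catalog[parts[-2]].append(link)

    # Write one txt file per Catalog ID, one link per line.
    for catalog_id, catalog_links in by_catalog.items():
        with open(catalog_id + ".txt", "w") as f:
            f.write("\n".join(catalog_links) + "\n")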
- getCatalog.py
  This script loops through the txt files produced by parse.py and collects the Catalog IDs into a single list.
  Input: txt files from parse.py
  Output: postevent_catalogs.txt (an input file for script.py)
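
  A short sketch of the ID collection, assuming each txt file from parse.py is named <catalog_id>.txt as in the sketch above:

    import glob
    import os

    catalog_ids = []
    for path in sorted(glob.glob("*.txt")):
        name = os.path.splitext(os.path.basename(path))[0]
        if name != "postevent_catalogs":  # skip our own output on re-runs
            catalog_ids.append(name)

    with open("postevent_catalogs.txt", "w") as f:
        f.write("\n".join(catalog_ids) + "\n")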
- script.py
  This script loops through each Catalog ID in the output file from getCatalog.py and writes a bash script that downloads that catalog's images into its own folder.
  Input: postevent_catalogs.txt, txt files from parse.py
  Output: bash scripts (.sh, one per Catalog ID)
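
  A hedged sketch of the generator; the download_<id>.sh naming and the use of wget are assumptions, not necessarily what the original emits:

    # Read the Catalog IDs produced by getCatalog.py.
    with open("postevent_catalogs.txt") as f:
        catalog_ids = [line.strip() for line in f if line.strip()]

    for catalog_id in catalog_ids:
        with open(catalog_id + ".txt") as f:
            links = [line.strip() for line in f if line.strip()]
        # One bash script per catalog: make its folder, then fetch each link.
        with open("download_" + catalog_id + ".sh", "w") as f:
            f.write("#!/bin/bash\n")
            f.write("mkdir -p " + catalog_id + "\n")
            for link in links:
                f.write("wget -P " + catalog_id + " " + link + "\n")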
- run-all.py
  This script creates a single bash script that runs all the individual bash scripts produced by script.py, so all the images can be downloaded without calling each script by hand. Run it from the top of the directory tree where the per-catalog folders of tifs should be saved.
  Input: bash scripts from script.py
  Output: run-all-post.sh
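
  A minimal sketch, assuming the per-catalog scripts follow the download_<id>.sh naming from the sketch above:

    import glob

    # Chain every per-catalog script into one master script.
    with open("run-all-post.sh", "w") as f:
        f.write("#!/bin/bash\n")
        for script in sorted(glob.glob("download_*.sh")):
            f.write("bash " + script + "\n")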
- loop_through_dir.py
  This script goes through all the tif files, finds the coordinate range of each, and saves the results in a csv file.
  Input: tif files in folders
  Output: tifRange-post-1.csv
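
  A sketch of the range extraction using rasterio (the original may use the GDAL bindings directly); the csv column names are illustrative:

    import csv
    import glob

    import rasterio

    with open("tifRange-post-1.csv", "w", newline="") as out:
        writer = csv.writer(out)
        writer.writerow(["tif", "left", "bottom", "right", "top"])
        for path in glob.glob("**/*.tif", recursive=True):
            with rasterio.open(path) as src:
                b = src.bounds  # bounding box in the tif's coordinate system
                writer.writerow([path, b.left, b.bottom, b.right, b.top])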
- tomnod.py
  This script goes through the TomNod GeoJSON file and builds a table matching each point to the specific tif file that contains it. It also scans all the tif files to locate points whose Catalog ID was not indicated.
  Input: digitalglobe_crowdsourcing_hurricane_harvey_20170915.geojson, tifRange-post-1.csv
  Output: coordinateAndTif.csv
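
  A sketch of the point-to-tif matching, assuming point geometries and the csv columns from the sketch above; the brute-force containment test stands in for whatever lookup the original uses:

    import csv
    import json

    with open("tifRange-post-1.csv") as f:
        ranges = list(csv.DictReader(f))

    with open("digitalglobe_crowdsourcing_hurricane_harvey_20170915.geojson") as f:
        features = json.load(f)["features"]

    with open("coordinateAndTif.csv", "w", newline="") as out:
        writer = csv.writer(out)
        writer.writerow(["lon", "lat", "tif"])
        for feat in features:
            lon, lat = feat["geometry"]["coordinates"][:2]
            for r in ranges:  # check each tif's bounding box for the point
                if (float(r["left"]) <= lon <= float(r["right"]) and
                        float(r["bottom"]) <= lat <= float(r["top"])):
                    writer.writerow([lon, lat, r["tif"]])
                    break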
- crop_post_event.py
  This script uses the coordinateAndTif.csv reference file to crop the tif images to a fixed pixel size around each point.
  Input: coordinateAndTif.csv
  Output: jpeg images
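
  A sketch of the cropping step with rasterio and Pillow; the 256-pixel window, the RGB assumption, and the output naming are all guesses at the "determined, fixed pixel size":

    import csv

    import numpy as np
    import rasterio
    from rasterio.windows import Window
    from PIL import Image

    SIZE = 256  # assumed crop width/height in pixels

    with open("coordinateAndTif.csv") as f:
        rows = list(csv.DictReader(f))

    for i, row in enumerate(rows):
        with rasterio.open(row["tif"]) as src:
            # Convert the point's coordinates to a pixel row/col, then
            # read a fixed-size window centered on it.
            r, c = src.index(float(row["lon"]), float(row["lat"]))
            window = Window(c - SIZE // 2, r - SIZE // 2, SIZE, SIZE)
            data = src.read(window=window)
        if data.shape[0] >= 3:  # keep the first three bands as RGB
            img = np.transpose(data[:3], (1, 2, 0)).astype(np.uint8)
            Image.fromarray(img).save("crop_{}.jpeg".format(i))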
- sort_images.py
  This script separates the 3-band images from the 1-band images and renames the files to a date_catalogid_img format.
  Input: tif images in folders
  Output: band-sorted images in folders
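
  A sketch of the band split; taking the date_catalogid prefix from the parent folder name is an assumption about where that information lives:

    import glob
    import os
    import shutil

    import rasterio

    os.makedirs("3band", exist_ok=True)
    os.makedirs("1band", exist_ok=True)

    for i, path in enumerate(sorted(glob.glob("**/*.tif", recursive=True))):
        with rasterio.open(path) as src:
            bands = src.count  # 3-band (RGB) vs 1-band (panchromatic)
        dest = "3band" if bands >= 3 else "1band"
        # Illustrative rename into the date_catalogid_img format.
        prefix = os.path.basename(os.path.dirname(path))
        shutil.copy(path, os.path.join(dest, "{}_img{}.tif".format(prefix, i)))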