This module is designed to facilitate interaction with the various data products used in the GLAM system, both Normalized Difference Vegetation Index (NDVI) products and the "ancillary" data products that fall outside the core NDVI functionality. These products can be downloaded and processed by glam_data_processing.
glam_data_processing was developed as part of the GLAM system's move to the cloud in 2019-2020. It provides a programmatic way to pull and ingest the necessary ancillary data, and offers reliable interaction with the AWS portion of the system.
Given the volume of data the GLAM system handles, a re-usable, general engine for downloading imagery, uploading it to the cloud, and extracting the relevant statistics is vital. This module provides all of that functionality for both the NDVI and ancillary data products.
The Downloader class can be used to pull any available data product, whether from its source (NASA, Copernicus, etc.) or from the GLAM AWS S3 bucket. The Downloader.pull() method allows quick and easy retrieval of image files. The resulting file name is automatically formatted for use with Image objects (see below), allowing for efficient automation.
Downloading of NDVI products relies on the octvi package.
This module offers two classes used for image ingestion and stats generation: one for use with ancillary data products, and one for use with NDVI products. Instances are initialized by providing the path to a well-named image on disk (e.g. Image("C:/swi.2019-01-01.tif")). Both classes inherit from the generic Image class, and thus share common attributes and methods.
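"Well-named" here means the product and date can be recovered from the filename itself. As an illustration only (this helper is not part of glam_data_processing), an ancillary filename such as "swi.2019-01-01.tif" decomposes as follows:

```python
import os

def parse_image_name(path):
    """Split a well-named ancillary image file (e.g. 'swi.2019-01-01.tif')
    into its product and date components.
    Illustrative helper; not part of the glam_data_processing API."""
    base = os.path.basename(path)
    product, date_string, _extension = base.split(".", 2)
    return product, date_string

product, date_string = parse_image_name("C:/swi.2019-01-01.tif")
print(product, date_string)  # swi 2019-01-01
```

This is why the file names produced by the Downloader can be handed directly to the Image classes: all the metadata they need travels in the name.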
Image.ingest() performs database ingestion and S3 uploading for the given image. Once this method executes successfully, the file will be available for display in the GLAM system, and custom statistics generation can be performed. Regional cached statistics, however, are not generated by this method.
Image.uploadStats() extracts and uploads regional statistics for the image, making them available for retrieval from the GLAM statistics database. Note that the image will not be visible through the GLAM system unless successfully ingested (see Image.ingest() above).
Handling of ancillary products (CHIRPS rainfall, MERRA-2 temperature, and Soil Water Index) should be done through the AncillaryImage class. When an instance of this class is successfully initialized (by passing the constructor the full path to the file on disk), the ingest() and uploadStats() methods will be available (see above).
Date format for ancillary files is "%Y-%m-%d"; e.g. "2019-01-01".
Handling of NDVI products (M*D09Q1, M*D13Q1, etc.) should be done through the ModisImage class. When an instance of this class is successfully initialized (by passing the constructor the full path to the file on disk), the ingest() and uploadStats() methods will be available (see above).
Date format for NDVI files is "%Y.%j"; e.g. "2019.001".
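The two date conventions can be translated into one another with the standard library alone; for example, converting an NDVI-style date ("%Y.%j", year and day-of-year) into the ancillary style ("%Y-%m-%d"):

```python
from datetime import datetime

# Parse an NDVI-style date string (year plus day-of-year) ...
ndvi_date = datetime.strptime("2019.001", "%Y.%j")

# ... and re-express it in the ancillary "%Y-%m-%d" style
print(ndvi_date.strftime("%Y-%m-%d"))  # 2019-01-01
```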
It is possible to set credentials and update all data streams from the command line.
The glamconfigure script prompts the user to configure their credentials for the GLAM database and for the two password-protected data archives (MERRA-2 and Copernicus) in the data stream. These credentials are written to a JSON file that glam_data_processing reads on load.
The glamupdatedata script is an all-in-one tool for keeping the GLAM archive up to date. The script identifies files that are missing from the archive but available for download, downloads them, ingests them, calculates their statistics, and then deletes the local copies. It can be run as a cron job to keep the data pool as current as possible.
```python
import glam_data_processing as glam  # helpful to provide a shorter name

# The ToDoList class searches the database for missing files,
# and also creates new database records for all potentially
# available files between the current date and the last record
# for each product type.

# create a ToDoList object
toDo = glam.ToDoList()

# filter out unavailable files; leave only the dates that have
# available imagery
toDo.filterUnavailabe()

# create a Downloader object
downloader = glam.Downloader()

# iterate over ToDoList
for t in toDo:  # yields tuples of the form ("PRODUCT", "DATE")
    files = downloader.pullFromSource(*t, "C:/temp")  # returns tuple of filepaths
    for f in files:
        # Use either ModisImage or AncillaryImage to ingest the data
        # and generate regional statistics
        img = glam.getImageType(f)(f)  # create correct Image object type
        img.ingest()  # ingest image to S3 bucket and database
        img.uploadStats()  # upload image statistics
```
MIT License
Copyright (c) 2020 F. Dan O'Neill
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.