---
title: "Spatial Querying of GEDI Version 2 Data in R"
output:
  html_document:
    df_print: paged
---
The Global Ecosystem Dynamics Investigation ([GEDI](https://lpdaac.usgs.gov/data/get-started-data/collection-overview/missions/gedi-overview/)) mission aims to characterize ecosystem structure and dynamics to enable radically improved quantification and understanding of the Earth's carbon cycle and biodiversity. The Land Processes Distributed Active Archive Center (LP DAAC) distributes the GEDI Level 1 and Level 2 Version 1 and Version 2 products. The LP DAAC created the GEDI Finder _Web Service_ to allow users to perform spatial queries of GEDI _Version 1_ L1-L2 full-orbit granules. One of the updates for GEDI _Version 2_ included additional spatial metadata that allows users to perform spatial queries via a graphical user interface (GUI) using NASA's [Earthdata Search](https://search.earthdata.nasa.gov/search) or programmatically using NASA's [Common Metadata Repository](https://cmr.earthdata.nasa.gov/search) (CMR). Another update is that each GEDI V1 full-orbit granule has been divided into 4 sub-orbit granules in V2.
### The objective of this tutorial is to demonstrate how current GEDI Finder users can update their workflow for GEDI Version 2 (V2) data using NASA's CMR to perform spatial [bounding box] queries for GEDI V2 L1B, L2A, and L2B data, and how to reformat the CMR response into a list of links that will allow users to download the intersecting GEDI V2 sub-orbit granules directly from the LP DAAC Data Pool.
## Use Case Example:
This tutorial was developed using an example use case for a current GEDI Finder user who has been using the GEDI Finder web service in R to find intersecting GEDI L2A Version 1 full-orbit granules over the Amazon Rainforest. The user is now looking to use the same workflow to find intersecting GEDI L2A V2 sub-orbit granules.
This tutorial will show how to use R to perform a spatial query for GEDI V2 data using NASA's CMR, how to reformat the CMR response into a list of links pointing to the intersecting sub-orbit granules on the LP DAAC Data Pool, and how to export the list of links as a text file.
***
## Applicable Data Products:
### This tutorial can be used to perform spatial queries on the following products:
- **GEDI L1B Geolocated Waveform Data Global Footprint Level - [GEDI01_B.002](https://doi.org/10.5067/GEDI/GEDI01_B.002)**
- **GEDI L2A Elevation and Height Metrics Data Global Footprint Level - [GEDI02_A.002](https://doi.org/10.5067/GEDI/GEDI02_A.002)**
- **GEDI L2B Canopy Cover and Vertical Profile Metrics Data Global Footprint Level - [GEDI02_B.002](https://doi.org/10.5067/GEDI/GEDI02_B.002)**
***
# Topics Covered:
1. [**Import Packages**](#importpackages)
2. [**Define Function to Query CMR**](#definefunction)
3. [**Execute GEDI Finder Function**](#executefunction)
4. [**Export Results**](#exportresults)
***
# Before Starting this Tutorial:
## Setup and Dependencies
This tutorial is written as an R Markdown Notebook. To execute it, users need R/RStudio installed, along with the packages required to run an R Markdown notebook. `httr` is the only additional package used in this tutorial.
#### Having trouble getting set up? Contact LP DAAC User Services at: https://lpdaac.usgs.gov/lpdaac-contact-us/
***
## Source Code used to Generate this Tutorial:
#### The repository containing the files is located at: https://git.earthdata.nasa.gov/projects/LPDUR/repos/gedi-finder-tutorial-r/browse
- [R Notebook](https://git.earthdata.nasa.gov/projects/LPDUR/repos/gedi-finder-tutorial-r/browse/GEDI_Finder_Tutorial_R.rmd)
#### If you prefer to execute the code used in this tutorial outside of a Notebook, a simple R script version is available:
- [R Script](https://git.earthdata.nasa.gov/projects/LPDUR/repos/gedi-finder-tutorial-r/browse/GEDI_Finder.R)
***
# 1. Import Packages <a id="importpackages"></a>
The only package used in this tutorial is `httr`.
```{r}
# Check for required packages, install if not previously installed
if (!requireNamespace("httr", quietly = TRUE)) { install.packages("httr") }

# Import Packages
library(httr)
```
# 2. Define Function to Query CMR <a id="definefunction"></a>
In the code cell below, define a function called `gedi_finder` that takes two *user-submitted* input values, a `product` and a `bbox`.
The function supports three products: 'GEDI01_B.002', 'GEDI02_A.002', and 'GEDI02_B.002'. A list relates each product `shortname.version` to its associated `concept_id`, the value that NASA's CMR uses to identify a specific product. Additional information on concept IDs can be found in the [CMR Search API Documentation](https://cmr.earthdata.nasa.gov/search/site/docs/search/api.html#c-concept-id).
The second user-submitted input value, `bbox`, is a string of bounding box coordinate values (decimal degrees) in the following format:
Lower Left Longitude, Lower Left Latitude, Upper Right Longitude, Upper Right Latitude ("LLLon,LLLat,URLon,URLat")
> #### Example: `'-73.65,-12.64,-47.81,9.7'`
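
The `bbox` format described above can be sketched as a small helper that assembles the string from four numbers and rejects obviously invalid values. This is a minimal illustration only: `make_bbox` is a hypothetical name and is not part of the original tutorial or of `httr`, and the range checks are assumptions about what counts as a sensible bounding box.

```r
# Hypothetical helper (not part of the original tutorial): build and check a bbox string
make_bbox <- function(ll_lon, ll_lat, ur_lon, ur_lat) {
  coords <- c(ll_lon, ll_lat, ur_lon, ur_lat)
  # All four values must be finite numbers
  stopifnot(is.numeric(coords), length(coords) == 4, all(is.finite(coords)))
  # Longitudes must fall within [-180, 180] and latitudes within [-90, 90]
  stopifnot(abs(ll_lon) <= 180, abs(ur_lon) <= 180, abs(ll_lat) <= 90, abs(ur_lat) <= 90)
  # The lower left latitude must not exceed the upper right latitude
  stopifnot(ll_lat <= ur_lat)
  # Longitude order is deliberately not checked in this sketch
  # (a box that crosses the antimeridian would have ll_lon > ur_lon)
  # Join the values with commas and no white space
  paste(coords, collapse = ",")
}

# Amazon Rainforest example from above
example_bbox <- make_bbox(-73.65, -12.64, -47.81, 9.7)
print(example_bbox)
```

The resulting string can be passed as the `bbox` argument of the function defined below.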
```{r}
# Define Function to Query CMR
gedi_finder <- function(product, bbox) {

  # Define the base CMR granule search url, including the LPCLOUD provider name and max page size (2000 is the max allowed)
  cmr <- "https://cmr.earthdata.nasa.gov/search/granules.json?pretty=true&provider=LPCLOUD&page_size=2000&concept_id="

  # Set up list where key is GEDI shortname + version and value is CMR Concept ID
  concept_ids <- list('GEDI01_B.002'='C2142749196-LPCLOUD',
                      'GEDI02_A.002'='C2142771958-LPCLOUD',
                      'GEDI02_B.002'='C2142776747-LPCLOUD')

  # CMR uses pagination for queries with more features returned than the page size
  page <- 1
  bbox <- gsub(' ', '', bbox)  # Remove all white spaces (sub() would only remove the first one)
  granules <- list()           # Set up a list to store and append granule links to

  # Send GET request to CMR granule search endpoint w/ product concept ID, bbox & page number
  cmr_url <- sprintf("%s%s&bounding_box=%s&pageNum=%s", cmr, concept_ids[[product]], bbox, page)
  cmr_response <- GET(cmr_url)

  # Verify the request submission was successful
  if (cmr_response$status_code == 200) {

    # Reuse the first response rather than sending the same request again, format return as a list
    entries <- content(cmr_response)$feed$entry
    n_last <- length(entries)  # Number of features returned on the most recent page

    # If the last page was full (2000 features), move to the next page, submit another request, and append
    # Checking the last page (not the running total) ends the loop on an empty or exactly full result set
    while (n_last == 2000) {
      page <- page + 1
      cmr_url <- sprintf("%s%s&bounding_box=%s&pageNum=%s", cmr, concept_ids[[product]], bbox, page)
      next_page <- content(GET(cmr_url))$feed$entry
      n_last <- length(next_page)
      entries <- c(entries, next_page)
    }

    # CMR returns more info than just the Data Pool links, below use for loop to grab each DP link, and add to list
    # seq_along() skips the loop when no granules were found (1:length() would iterate over 1 and 0)
    for (i in seq_along(entries)) {
      granules[[i]] <- entries[[i]]$links[[1]]$href
    }

    # Return the list of links
    return(granules)
  } else {

    # If the request did not complete successfully, print out the errors from the CMR response already received
    print(content(cmr_response)$errors)
  }
}
```
The function returns a list of links to download the intersecting GEDI sub-orbit V2 granules directly from the LP DAAC's Data Pool.
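
The paging behavior can be checked without contacting CMR. The sketch below is an illustration under stated assumptions, not the tutorial's function: `fetch_page` and `collect_all` are hypothetical names, and `fetch_page` simulates one request by returning the entries that would appear on a given page of a result set of known size. It shows one way to express the stopping rule, which is to keep requesting pages only while the most recent page came back full.

```r
# Hypothetical stand-in for one CMR request: returns the entries on a given page
# of a simulated result set containing `total` entries (no network access needed)
fetch_page <- function(page, total, page_size) {
  first <- (page - 1) * page_size + 1
  if (first > total) return(list())  # Past the end of the results: empty page
  as.list(seq(first, min(page * page_size, total)))
}

# Collect every page, stopping as soon as a page comes back less than full
collect_all <- function(total, page_size = 2000) {
  page <- 1
  entries <- fetch_page(page, total, page_size)
  n_last <- length(entries)  # Size of the most recent page
  while (n_last == page_size) {
    page <- page + 1
    next_page <- fetch_page(page, total, page_size)
    n_last <- length(next_page)
    entries <- c(entries, next_page)
  }
  entries
}

print(length(collect_all(4500)))  # Two full pages plus one partial page
print(length(collect_all(2000)))  # Exactly one full page, then an empty page
print(length(collect_all(0)))     # No results at all
```

Testing the size of the most recent page, rather than the running total, lets the loop end cleanly when a query returns no entries or an exact multiple of the page size.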
# 3. Execute GEDI Finder Function <a id="executefunction"></a>
### Below is a demonstration of how to assign the two required inputs of the `gedi_finder` function to variables.
```{r}
# User-provided inputs (UPDATE FOR YOUR DESIRED PRODUCT AND BOUNDING BOX REGION OF INTEREST)
product <- 'GEDI02_B.002' # Options include 'GEDI01_B.002', 'GEDI02_A.002', 'GEDI02_B.002'
bbox <- '-73.65,-12.64,-47.81,9.7' # bounding box coords in LL Longitude, LL Latitude, UR Longitude, UR Latitude format
```
Above, the variables are defined to query the `GEDI02_B.002` product for a bounding box covering the Amazon Rainforest.
Next, call the `gedi_finder` function for the desired product and bounding box region of interest defined above, and set the output to a variable.
```{r}
# Call the gedi_finder function using the user-provided inputs
granules <- gedi_finder(product, bbox)
print(sprintf("%s %s Version 2 granules found.", length(granules), product))
```
The print statement above reports how many granules intersect your bounding box for the requested product.
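
Before exporting, it can be useful to look at the returned links. The sketch below uses made-up placeholder links (the `example.com` URLs and the `example_granules` name are hypothetical, not real query results) to show how a list shaped like the function's output can be flattened, de-duplicated, and reduced to file names with base R.

```r
# Hypothetical placeholder links for illustration only (not real query results)
example_granules <- list(
  "https://example.com/GEDI02_B.002/granule_a.h5",
  "https://example.com/GEDI02_B.002/granule_b.h5",
  "https://example.com/GEDI02_B.002/granule_a.h5"
)

# Flatten the list into a character vector and drop any repeated links
links <- unique(unlist(example_granules))

# Pull the file name from the end of each link
file_names <- basename(links)

print(length(links))
print(file_names)
```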
# 4. Export Results <a id="exportresults"></a>
Below is a demonstration of how to export the `granules` list of Data Pool links for intersecting GEDI V2 granules as a text file. The text file will be written to your current working directory, and its name combines the product name with the date and time that the file was created.
```{r}
# Export Results
# Set up output textfile name using the product name and the current datetime
# fixed = TRUE treats '.002' as literal text rather than as a regular expression
outName <- sprintf("%s_GranuleList_%s.txt", sub('.002', '_002', product, fixed = TRUE), format(Sys.time(), "%Y%m%d%H%M%S"))

# Save to text file in current working directory
write.table(granules, outName, row.names = FALSE, col.names = FALSE, quote = FALSE, sep='\n')
print(sprintf("File containing links to intersecting %s Version 2 data has been saved to: %s/%s", product, getwd(), outName))
```