This lesson will explore how to find and download large gridded datasets via the R package `geoknife`. The package was created to allow easy access to data stored in the [Geo Data Portal (GDP)](https://cida.usgs.gov/gdp/), or any gridded dataset available through the [OPeNDAP](https://www.opendap.org/) protocol DAP2. `geoknife` refers to the gridded dataset as the `fabric`, the spatial feature of interest as the `stencil`, and the subset algorithm parameters as the `knife` (see below).
-![geoknife terminology figure](../static/img/geoknife_summary.png "figure illustrating definitions of fabric, stencil, and knife")
![geoknife terminology figure](../static/img/geoknife_summary.png#inline-img "figure illustrating definitions of fabric, stencil, and knife")
## Lesson Objectives
author: Lindsay R. Carr
date: 9999-10-01
slug: geoknife-intro
title: geoknife - Introduction
draft: true
image: img/main/intro-icons-300px/r-logo.png
parent: Introduction to USGS R Packages
weight: 2
-draft: true
Lesson Summary
This lesson will explore how to find and download large gridded datasets via the R package `geoknife`. The package was created to allow easy access to data stored in the [Geo Data Portal (GDP)](https://cida.usgs.gov/gdp/), or any gridded dataset available through the [OPeNDAP](https://www.opendap.org/) protocol DAP2. `geoknife` refers to the gridded dataset as the `fabric`, the spatial feature of interest as the `stencil`, and the subset algorithm parameters as the `knife` (see below).
-![geoknife terminology figure](../static/img/geoknife_summary.png "figure illustrating definitions of fabric, stencil, and knife")
![geoknife terminology figure](../static/img/geoknife_summary.png#inline-img "figure illustrating definitions of fabric, stencil, and knife")
Lesson Objectives
+---
+title: "sbtools - Data discovery"
+date: "9999-07-01"
+author: "Lindsay R. Carr"
+slug: "sbtools-discovery"
+image: "img/main/intro-icons-300px/r-logo.png"
+output: USGSmarkdowntemplates::hugoTraining
+parent: Introduction to USGS R Packages
+weight: 2
+draft: true
+---
+```{r setup, include=FALSE, warning=FALSE, message=FALSE}
+library(knitr)
+# custom hook controlling how plots are written to the output
+knit_hooks$set(plot=function(x, options) {
+  sprintf("",
+          options$fig.path, options$label, options$fig.cur, options$fig.ext, options$fig.cap)
+})
+# default options applied to every chunk
+opts_chunk$set(
+  echo=TRUE,
+  fig.path="static/sbtools-discovery/",
+  fig.width = 6,
+  fig.height = 6,
+  fig.cap = "TODO"
+)
+```
+Although ScienceBase is a great platform for uploading and storing your data, you can also use it to find other available data. You can do that manually by searching using the ScienceBase web interface or through `sbtools` functions.
+## Discovering data via web interface
+The most familiar way to search for data is to use the search capabilities of the ScienceBase website. You can search for any publicly available data in the [ScienceBase catalog](https://www.sciencebase.gov/catalog/). Search by category (map, data, project, publication, etc.), topic-based tags, or location, or search using your own keywords.
+![ScienceBase Catalog Homepage](../static/img/sb_catalog_search.png#inline-img "search ScienceBase catalog")
+Learn more about the [catalog search features](https://www.sciencebase.gov/about/content/explore-sciencebase#2.%20Search%20ScienceBase) and explore the [advanced searching capabilities](https://www.sciencebase.gov/about/content/sciencebase-advanced-search) on the ScienceBase help pages.
+## Discovering data via sbtools
+The ScienceBase web search tools are powerful, but a search performed by hand in a browser is hard to reproduce exactly. If you want to incorporate dataset queries into a reproducible workflow, you can script them using the `sbtools` query functions. The terminology differs slightly from the web interface. The following functions are available to query the catalog:
+1. `query_sb_text` (matches title or description)
+2. `query_sb_doi` (use a DOI identifier)
+3. `query_sb_spatial` (data within or at a specific location)
+4. `query_sb_date` (items within time range)
+5. `query_sb_datatype` (type of data, not necessarily file type)
+6. `query_sb` (generic SB query)
+These functions take a variety of inputs, and all return an R list of `sbitems` (a special `sbtools` class). All of these functions return at most 20 search results by default, but you can change that by specifying the argument `limit`. `query_sb` is a generalization of the other functions and accepts a number of additional query specifications: [Lucene query string](http://www.lucenetutorial.com/lucene-query-syntax.html), folder and parent items, item ids, or project status. Before we practice using these functions, make sure you load the `sbtools` package in your current R session.
+```{r load, message=FALSE, warning=FALSE}
+library(sbtools)
+```
+### Using `query_sb_text`
+`query_sb_text` returns a list of `sbitems` that match the title or description fields. Use it to search authors, station names, rivers, states, etc.
+```{r query_sb_text}
+# search using a contributor's name
+contrib_results <- query_sb_text("Robert Hirsch")
+head(contrib_results, 2)
+# search using place of interest
+park_results <- query_sb_text("Yellowstone")
+head(park_results, 2)
+# search using a river
+river_results <- query_sb_text("Rio Grande")
+length(river_results)
+head(river_results, 2)
+```
+It might be easier to review query results by looking only at their titles. The other information stored in an `sbitem` is useful, but a little distracting when you are looking at many results. You can use `sapply` to extract the titles.
+```{r query_sb-sapply}
+# look at all titles returned from the river query made previously
+sapply(river_results, function(item) item$title)
+```
+Now you can use `sapply` to look at the titles for your returned searches instead of `head`.
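One detail worth knowing about the title-extraction pattern: `sapply` returns an empty `list()` rather than a character vector when a query comes back with no items. The sketch below uses `vapply` as a type-stable alternative. It runs on a small mock list standing in for real query results, and `get_titles` is a hypothetical helper name, not an `sbtools` function.

```r
# Mock results standing in for the list returned by a query_sb_* call.
# Real sbitems carry more fields; only `title` is needed here.
mock_results <- list(
  list(title = "Upper Rio Grande", id = "a1"),
  list(title = "Notropis jemezanus (Rio Grande shiner)", id = "b2")
)

# Hypothetical helper: vapply() always returns a character vector,
# even for an empty result list, whereas sapply() would return list().
get_titles <- function(items) {
  vapply(items, function(item) item$title, character(1))
}

titles <- get_titles(mock_results)
no_titles <- get_titles(list())

print(titles)
print(length(no_titles))
```

The same helper could be applied to `river_results` or any other list of `sbitems`, assuming each item has a `title` element.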
+### Using `query_sb_doi`
+Use a Digital Object Identifier (DOI) to query ScienceBase. This should return only one list item, unless more than one ScienceBase item references the same DOI.
+```{r query_sb_doi}
+# USGS Microplastics study
+# Environmental Characteristics data
+```
+### Using `query_sb_spatial`
+`query_sb_spatial` accepts three different ways of specifying the spatial area in which to look for data. To illustrate them, we are going to use the spatial extents of the Appalachian Mountains and the continental US.
+```{r query_sb_spatial}
+appalachia <- data.frame(
+ lat = c(34.576900, 36.114974, 37.374456, 35.919619, 39.206481),
+ long = c(-84.771119, -83.393990, -81.256731, -81.492395, -78.417345))
+conus <- data.frame(
+ lat = c(49.078148, 47.575022, 32.914614, 25.000481),
+ long = c(-124.722111, -67.996898, -118.270335, -80.125804))
+# verifying where points are supposed to be
+maps::map("usa") # draw a base map first; assumes the `maps` package is installed
+points(conus$long, conus$lat, col="red", pch=20)
+points(appalachia$long, appalachia$lat, col="green", pch=20)
+```
+The first way to query spatially is by specifying a bounding box `bbox` as an `sp` spatial data object. Visit the [`sp` package documentation](https://cran.r-project.org/web/packages/sp/vignettes/intro_sp.pdf) for more information on spatial data objects.
+```{r query_sb_spatial-bbox}
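+# query by bounding box
+# SpatialPoints() reads the first column as x, so put longitude before latitude
+query_sb_spatial(bbox =
+  sp::SpatialPoints(appalachia[, c("long", "lat")], proj4string =
+    sp::CRS("+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs")))
+```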
+Alternatively, you can supply a vector of latitudes and a vector of longitudes using the `lat` and `long` arguments. The function will automatically use the minimum and maximum of those vectors to construct a bounding box.
+```{r query_sb_spatial-latlong}
+# query by latitude and longitude vectors
+query_sb_spatial(long = appalachia$long, lat = appalachia$lat)
+query_sb_spatial(long = conus$long, lat = conus$lat)
+```
+The last way to represent a spatial region to query ScienceBase is with a POLYGON well-known text (WKT) object supplied as a text string. The format is `"POLYGON((LONG1 LAT1, LONG2 LAT2, LONG3 LAT3))"`, where each `LONG#` `LAT#` pair is a longitude and latitude in decimal degrees. See the [Open Geospatial Consortium WKT standard](http://www.opengeospatial.org/standards/wkt-crs) for more information.
+```{r query_sb_spatial-wkt}
+# query by WKT polygon
+wkt_coord_str <- paste(conus$long, conus$lat, sep=" ", collapse = ",")
+wkt_str <- sprintf("POLYGON((%s))", wkt_coord_str)
+query_sb_spatial(bb_wkt = wkt_str)
+```
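The bounding-box and WKT approaches can be combined. The sketch below takes the minimum and maximum of the coordinate vectors and writes the four corners as a closed ring, repeating the first corner at the end as the OGC Simple Features definition of a polygon expects. `bbox_wkt` is a hypothetical helper, not part of `sbtools`, and whether ScienceBase requires a closed ring is an assumption rather than something this lesson confirms.

```r
# Hypothetical helper: build a closed WKT rectangle from coordinate vectors.
bbox_wkt <- function(long, lat) {
  x <- range(long)  # minimum and maximum longitude
  y <- range(lat)   # minimum and maximum latitude
  # Corners in counter-clockwise order; the first corner is repeated at the
  # end so that the polygon ring is closed.
  corners <- rbind(
    c(x[1], y[1]),
    c(x[2], y[1]),
    c(x[2], y[2]),
    c(x[1], y[2]),
    c(x[1], y[1])
  )
  coord_str <- paste(corners[, 1], corners[, 2], collapse = ", ")
  sprintf("POLYGON((%s))", coord_str)
}

wkt_box <- bbox_wkt(long = c(-84.5, -78.25), lat = c(34.5, 39.25))
print(wkt_box)

# With sbtools loaded, the string could then be passed on, e.g.:
# query_sb_spatial(bb_wkt = wkt_box)
```

Building the ring from `range()` also avoids depending on the order in which the input points happen to be listed.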
+### Using `query_sb_date`
+`query_sb_date` returns ScienceBase items that fall within a certain time range. Several timestamps are applied to each item, so you need to specify which one to match against the range using `date_type`. By default, the query looks for items last updated between 1970-01-01 and today's date. See `?query_sb_date` for examples of which timestamps are available.
+```{r query_sb_date}
+# find data worked on in the last week
+today <- Sys.Date()
+oneweekago <- today - as.difftime(7, units='days')
+recent_data <- query_sb_date(start = oneweekago, end = today)
+sapply(recent_data, function(item) item$title)
+# find data that's been created over the last year
+oneyearago <- today - as.difftime(365, units='days')
+recent_data <- query_sb_date(start = oneyearago, end = today, date_type = "dateCreated")
+sapply(recent_data, function(item) item$title)
+```
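Because `start` must be the earlier date and `end` the later one, it can help to compute both in one place. The following is a minimal sketch of such a helper; `date_window` is a hypothetical name, and a fixed end date is used only to keep the example reproducible.

```r
# Hypothetical helper: return a start/end pair with start before end.
date_window <- function(days_back, end = Sys.Date()) {
  stopifnot(is.numeric(days_back), days_back >= 0)
  list(start = end - days_back, end = end)
}

# A fixed end date keeps the example reproducible
last_week <- date_window(7, end = as.Date("2017-06-15"))
print(last_week)

# With sbtools loaded, the pair could then be passed on, e.g.:
# query_sb_date(start = last_week$start, end = last_week$end)
```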
+### Using `query_sb_datatype`
+`query_sb_datatype` is used to search ScienceBase by the type of data an item is listed as. Run `sb_datatypes()` to get a list of 50 available data types.
+```{r query_sb_datatype}
+# get ScienceBase news items
+sbnews <- query_sb_datatype("News")
+sapply(sbnews, function(item) item$title)
+# find shapefiles
+shps <- query_sb_datatype("Shapefile")
+sapply(shps, function(item) item$title)
+# find raster data
+sbraster <- query_sb_datatype("Raster")
+sapply(sbraster, function(item) item$title)
+```
+## Best of both methods
+Although you can query from R, sometimes it's useful to look at an item on the web interface. You can run one of the `query_sb_*` functions and then follow the URL stored in a returned item to view it on the web. This is especially handy for viewing maps and metadata, or for checking or repairing a ScienceBase item if any of the `sbtools`-based commands have failed.
+```{r query_sb-website}
+sbmaps <- query_sb_datatype("Static Map Image", limit=3)
+oneitem <- sbmaps[[1]]
+# get and open URL from single sbitem
+url_oneitem <- oneitem[['link']][['url']]
+browseURL(url_oneitem)
+# get and open URLs from many sbitems
+lapply(sbmaps, function(sbitem) {
+  url <- sbitem[['link']][['url']]
+  browseURL(url)
+  return(url)
+})
+```
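If you only have an item ID, for example one copied from printed query output, the catalog URL can be assembled by hand. The sketch below assumes the `https://www.sciencebase.gov/catalog/item/<id>` pattern seen in the links returned by this lesson's queries. `item_url` is a hypothetical helper, not an `sbtools` function.

```r
# Hypothetical helper: build a catalog URL from a ScienceBase item ID.
# The URL pattern is an assumption based on links returned by the queries.
item_url <- function(id) {
  paste0("https://www.sciencebase.gov/catalog/item/", id)
}

ids <- c("4f4e4813e4b07f02db4da961", "57b3b8e7e4b03bcb0103980a")
urls <- item_url(ids)
print(urls)

# Only open a browser when running interactively
if (interactive()) {
  utils::browseURL(urls[1])
}
```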
+### Using `query_sb`
+`query_sb` is the "catch-all" function for querying ScienceBase from R. It only takes one argument for specifying query parameters, `query_list`. This is an R list with specific query parameters as the list names and the user query string as the list values. See the `Description` section of the help file for all options (`?query_sb`).
+```{r query_sb-keywords}
+# search by keyword
+precip_data <- query_sb(query_list = list(q = 'precipitation'))
+length(precip_data) # 20 entries, but there are likely more than 20 results
+sapply(precip_data, function(item) item$title)
+# search by keyword, sort by last updated, and increase num results allowed
+precip_data_recent <- query_sb(query_list = list(q = 'precipitation',
+                                                 sort = 'lastUpdated',
+                                                 limit = 50))
+length(precip_data_recent) # 50 entries, but the search criteria are the same, just sorted
+sapply(precip_data_recent, function(item) item$title)
+# search by keyword + type
+# Used sb_datatypes() to figure out what types were allowed for "browseType"
+precip_maps_data <- query_sb(query_list = list(q = 'precipitation',
+                                               browseType = "Static Map Image",
+                                               sort = 'title'))
+sapply(precip_maps_data, function(item) item$title)
+```
+If you want to search by more than one keyword, you should use Lucene query syntax. Visit [this page](http://www.lucenetutorial.com/lucene-query-syntax.html) for information on Lucene queries. For instance, you can have results returned that include both "flood" and "earthquake", or either "flood" or "earthquake". Current functionality requires a regular query to be specified in order for `lq` to return results. So, just include `q = ''` when executing Lucene queries (this is a known [issue](https://github.com/USGS-R/sbtools/issues/236) in `sbtools`).
+```{r query_sb-lucene}
+# search by 2 keywords (AND)
+hazard2and_data <- query_sb(query_list = list(q = '', lq = 'flood AND earthquake'),
+ limit=200)
+# search by 2 keywords (OR)
+hazard2or_data <- query_sb(query_list = list(q = '', lq = 'flood OR earthquake'),
+ limit=200)
+# search by 3 keywords with grouped query
+hazard3_data <- query_sb(query_list =
+ list(q = '', lq = '(flood OR earthquake) AND drought'),
+ limit=200)
+```
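When the keywords come from a vector rather than being typed by hand, the Lucene string can be assembled programmatically. This is a minimal sketch: `lucene_join` is a hypothetical helper, and wrapping multi-word terms in double quotes follows standard Lucene phrase syntax.

```r
# Hypothetical helper: join keywords with a Lucene boolean operator.
# Multi-word terms are wrapped in double quotes so they are treated as phrases.
lucene_join <- function(terms, operator = c("AND", "OR")) {
  operator <- match.arg(operator)
  needs_quotes <- grepl(" ", terms, fixed = TRUE)
  terms[needs_quotes] <- sprintf('"%s"', terms[needs_quotes])
  paste(terms, collapse = paste0(" ", operator, " "))
}

lq_all <- lucene_join(c("flood", "earthquake"), "AND")
lq_any <- lucene_join(c("flood", "storm surge"), "OR")
print(lq_all)
print(lq_any)

# The string could then be supplied as `lq`, keeping q = '' as described above:
# query_sb(query_list = list(q = '', lq = lq_all), limit = 200)
```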
+## No results
+Some of your queries will probably return no results. When there are no results that match your query, the returned list will have a length of 0.
+```{r query_sb-empty}
+# search for items related to a Water Quality Portal paper DOI
+wqp_paper <- query_sb_doi(doi = '10.1002/2016WR019993')
+# spatial query in the middle of the Atlantic Ocean
+atlantic_ocean <- query_sb_spatial(long=-41.436485, lat=28.790431)
+# date query during Marco Polo's life
+marco_polo <- query_sb_date(start = as.Date("1254-09-15"),
+                            end = as.Date("1324-01-08"))
+```
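In a script, it is worth checking for an empty result before indexing into it, since `results[[1]]` fails on a zero-length list. A minimal sketch follows, using mock lists in place of real query results. `summarize_results` is a hypothetical helper.

```r
# Hypothetical helper: report on a result list without failing when it is empty.
summarize_results <- function(items, label) {
  if (length(items) == 0) {
    return(sprintf("%s: no results", label))
  }
  sprintf("%s: %d result(s)", label, length(items))
}

# Mock inputs standing in for real query results
empty_msg <- summarize_results(list(), "atlantic_ocean")
found_msg <- summarize_results(list(list(title = "Upper Rio Grande")), "river_results")

print(empty_msg)
print(found_msg)
```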
+---
+author: Lindsay R. Carr
+date: 9999-07-01
+slug: sbtools-discovery
+title: sbtools - Data discovery
+draft: true
+image: img/main/intro-icons-300px/r-logo.png
+menu:
+  main:
+    parent: Introduction to USGS R Packages
+    weight: 2
+---
+Although ScienceBase is a great platform for uploading and storing your data, you can also use it to find other available data. You can do that manually by searching using the ScienceBase web interface or through `sbtools` functions.
+## Discovering data via web interface
+The most familiar way to search for data is to use the search capabilities of the ScienceBase website. You can search for any publicly available data in the [ScienceBase catalog](https://www.sciencebase.gov/catalog/). Search by category (map, data, project, publication, etc.), topic-based tags, or location, or search using your own keywords.
+![ScienceBase Catalog Homepage](../static/img/sb_catalog_search.png#inline-img "search ScienceBase catalog")
+Learn more about the [catalog search features](https://www.sciencebase.gov/about/content/explore-sciencebase#2.%20Search%20ScienceBase) and explore the [advanced searching capabilities](https://www.sciencebase.gov/about/content/sciencebase-advanced-search) on the ScienceBase help pages.
+## Discovering data via sbtools
+The ScienceBase web search tools are powerful, but a search performed by hand in a browser is hard to reproduce exactly. If you want to incorporate dataset queries into a reproducible workflow, you can script them using the `sbtools` query functions. The terminology differs slightly from the web interface. The following functions are available to query the catalog:
+1. `query_sb_text` (matches title or description)
+2. `query_sb_doi` (use a DOI identifier)
+3. `query_sb_spatial` (data within or at a specific location)
+4. `query_sb_date` (items within time range)
+5. `query_sb_datatype` (type of data, not necessarily file type)
+6. `query_sb` (generic SB query)
+These functions take a variety of inputs, and all return an R list of `sbitems` (a special `sbtools` class). All of these functions return at most 20 search results by default, but you can change that by specifying the argument `limit`. `query_sb` is a generalization of the other functions and accepts a number of additional query specifications: [Lucene query string](http://www.lucenetutorial.com/lucene-query-syntax.html), folder and parent items, item ids, or project status. Before we practice using these functions, make sure you load the `sbtools` package in your current R session.
+``` r
+library(sbtools)
+```
+### Using `query_sb_text`
+`query_sb_text` returns a list of `sbitems` that match the title or description fields. Use it to search authors, station names, rivers, states, etc.
+``` r
+# search using a contributor's name
+contrib_results <- query_sb_text("Robert Hirsch")
+head(contrib_results, 2)
+```
+ ## [[1]]
+ ##
+ ## Title: Input data and results of WRTDS models and seasonal rank-sum tests to determine trends in the quality of water in New Jersey streams, water years 1971-2011
+ ## Creator/LastUpdatedBy: /
+ ## Provenance (Created / Updated): /
+ ## Children:
+ ## Item ID: 573e031ee4b02c61aaace7eb
+ ## Parent ID: 56df010ae4b015c306fc2af9
+ ##
+ ## [[2]]
+ ##
+ ## Title: Evaluation of stream nutrient trends in the Lake Erie drainage basin in the presence of changing patterns in climate, streamflow, land drainage, and agricultural practices
+ ## Creator/LastUpdatedBy: /
+ ## Provenance (Created / Updated): /
+ ## Children:
+ ## Item ID: 57bb0ffce4b03fd6b7dd03dd
+ ## Parent ID: 52e6a0a0e4b012954a1a238a
+``` r
+# search using place of interest
+park_results <- query_sb_text("Yellowstone")
+head(park_results, 2)
+```
+ ## [[1]]
+ ##
+ ## Title: Spatial and temporal relations between fluvial and allacustrine Yellowstone cutthroat trout, Oncorhynchus clarki bouvieri, spawning in the Yellowstone River, outlet stream of Yellowstone Lake
+ ## Creator/LastUpdatedBy: /
+ ## Provenance (Created / Updated): /
+ ## Children:
+ ## Item ID: 50577ee7e4b01ad7e027f275
+ ## Parent ID: 5046602fe4b0241d49d62cab
+ ##
+ ## [[2]]
+ ##
+ ## Title: Conservation and Climate Adaptation Strategies for Yellowstone Cutthroat Trout
+ ## Creator/LastUpdatedBy: /
+ ## Provenance (Created / Updated): /
+ ## Children:
+ ## Item ID: 520039e8e4b0ad2d97189de0
+ ## Parent ID: 529e1574e4b0516126f68e8a
+``` r
+# search using a river
+river_results <- query_sb_text("Rio Grande")
+length(river_results)
+```
+ ## [1] 20
+``` r
+head(river_results, 2)
+```
+ ## [[1]]
+ ##
+ ## Title: Middle Rio Grande Multitemporal Land Cover Classifications - 1935, 1962, 1987, 1999, and 2014
+ ## Creator/LastUpdatedBy: /
+ ## Provenance (Created / Updated): /
+ ## Children:
+ ## Item ID: 58fe18eee4b0f87f0854ad3f
+ ## Parent ID: 5474ec49e4b04d7459a7eab2
+ ##
+ ## [[2]]
+ ##
+ ## Title: Water and Air Temperature Throughout the Range of Rio Grande Cutthroat Trout in Colorado and New Mexico; 2010-2015 V2
+ ## Creator/LastUpdatedBy: /
+ ## Provenance (Created / Updated): /
+ ## Children:
+ ## Item ID: 56d08559e4b015c306ee98c7
+ ## Parent ID: 5274215be4b097f32ac3f3d5
+It might be easier to review query results by looking only at their titles. The other information stored in an `sbitem` is useful, but a little distracting when you are looking at many results. You can use `sapply` to extract the titles.
+``` r
+# look at all titles returned from the river query made previously
+sapply(river_results, function(item) item$title)
+```
+ ## [1] "Middle Rio Grande Multitemporal Land Cover Classifications - 1935, 1962, 1987, 1999, and 2014"
+ ## [2] "Water and Air Temperature Throughout the Range of Rio Grande Cutthroat Trout in Colorado and New Mexico; 2010-2015 V2"
+ ## [3] "Upper Rio Grande"
+ ## [4] "Acoustic Doppler current profiler velocity data collected during 2015 and 2016 in the Calumet Harbor, Illinois"
+ ## [5] "Data for a Comprehensive Survey of Fault Zones, Breccias, and Fractures in and Flanking the Eastern Española Basin, Rio Grande Rift, New Mexico"
+ ## [6] "Magnetotelluric sounding locations, stations 1 to 22, Southern San Luis Valley, Colorado, 2006"
+ ## [7] "Notropis jemezanus (Rio Grande shiner)"
+ ## [8] "Etheostoma grahami (Rio Grande darter)"
+ ## [9] "Pseudemys gorzugi (Rio Grande Cooter)"
+ ## [10] "The Rio Grande, near Lost Trail Creek. Hinsdale County, Colorado. 1874. (Stereoscopic view)"
+ ## [11] "View of the Rio Grande near Pole Creek. Hinsdale County, Colorado. 1874. (Stereoscopic view)"
+ ## [12] "View on the Rio Grande, near Lost Trail Creek. Hinsdale County, Colorado. 1874. (Stereoscopic view)"
+ ## [13] "The Rio Grande, near Lost Trail Creek. Hinsdale County, Colorado. 1874. (Stereoscopic view)"
+ ## [14] "View of the Rio Grande, near Pole Creek. Hinsdale County, Colorado. 1874. (Stereoscopic view)"
+ ## [15] "Wagon Wheel Gap, Rio Grande River. Mineral County, Colorado. 1874. (Stereoscopic view)"
+ ## [16] "The Rio Grande Del Norte, below Wagon Wheel Gap. Mineral County, Colorado. 1874."
+ ## [17] "The Rio Grande Del Norte, below Wagon Wheel Gap. Mineral County, Colorado. 1874. (Stereoscopic view)"
+ ## [18] "View on the Rio Grande, near Lost Trail Creek. Hinsdale County, Colorado. 1874. (Stereoscopic view)"
+ ## [19] "Wagon Wheel Gap, Rio Grande River. Mineral County, Colorado. 1874. (Stereoscopic view)"
+ ## [20] "Wagon Wheel Gap, Rio Grande River. Mineral County, Colorado. 1874."
+Now you can use `sapply` to look at the titles for your returned searches instead of `head`.
+### Using `query_sb_doi`
+Use a Digital Object Identifier (DOI) to query ScienceBase. This should return only one list item, unless more than one ScienceBase item references the same DOI.
+``` r
+# USGS Microplastics study
+```
+ ## [[1]]
+ ##
+ ## Title: Microplastics in 29 Great Lakes tributaries (2014-15)
+ ## Creator/LastUpdatedBy: /
+ ## Provenance (Created / Updated): /
+ ## Children:
+ ## Item ID: 5748a29be4b07e28b664dd62
+ ## Parent ID: 55a9170be4b0183d66e4667e
+``` r
+# Environmental Characteristics data
+```
+ ## [[1]]
+ ##
+ ## Title: Selected Environmental Characteristics of Sampled Sites, Watersheds, and Riparian Zones for the U.S. Geological Survey Midwest Stream Quality Assessment
+ ## Creator/LastUpdatedBy: /
+ ## Provenance (Created / Updated): /
+ ## Children:
+ ## Item ID: 5714ec24e4b0ef3b7ca85d75
+ ## Parent ID: 569972c5e4b0ec051295ece5
+### Using `query_sb_spatial`
+`query_sb_spatial` accepts three different ways of specifying the spatial area in which to look for data. To illustrate them, we are going to use the spatial extents of the Appalachian Mountains and the continental US.
+``` r
+appalachia <- data.frame(
+ lat = c(34.576900, 36.114974, 37.374456, 35.919619, 39.206481),
+ long = c(-84.771119, -83.393990, -81.256731, -81.492395, -78.417345))
+conus <- data.frame(
+ lat = c(49.078148, 47.575022, 32.914614, 25.000481),
+ long = c(-124.722111, -67.996898, -118.270335, -80.125804))
+# verifying where points are supposed to be
+maps::map("usa") # draw a base map first; assumes the `maps` package is installed
+points(conus$long, conus$lat, col="red", pch=20)
+points(appalachia$long, appalachia$lat, col="green", pch=20)
+```
+The first way to query spatially is by specifying a bounding box `bbox` as an `sp` spatial data object. Visit the [`sp` package documentation](https://cran.r-project.org/web/packages/sp/vignettes/intro_sp.pdf) for more information on spatial data objects.
+``` r
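+# query by bounding box
+# SpatialPoints() reads the first column as x, so put longitude before latitude
+query_sb_spatial(bbox =
+  sp::SpatialPoints(appalachia[, c("long", "lat")], proj4string =
+    sp::CRS("+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs")))
+```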
+ ## Loading required namespace: sp
+ ## list()
+Alternatively, you can supply a vector of latitudes and a vector of longitudes using the `lat` and `long` arguments. The function will automatically use the minimum and maximum of those vectors to construct a bounding box.
+``` r
+# query by latitude and longitude vectors
+query_sb_spatial(long = appalachia$long, lat = appalachia$lat)
+```
+ ## list()
+``` r
+query_sb_spatial(long = conus$long, lat = conus$lat)
+```
+ ## list()
+The last way to represent a spatial region to query ScienceBase is with a POLYGON well-known text (WKT) object supplied as a text string. The format is `"POLYGON((LONG1 LAT1, LONG2 LAT2, LONG3 LAT3))"`, where each `LONG#` `LAT#` pair is a longitude and latitude in decimal degrees. See the [Open Geospatial Consortium WKT standard](http://www.opengeospatial.org/standards/wkt-crs) for more information.
+``` r
+# query by WKT polygon
+wkt_coord_str <- paste(conus$long, conus$lat, sep=" ", collapse = ",")
+wkt_str <- sprintf("POLYGON((%s))", wkt_coord_str)
+query_sb_spatial(bb_wkt = wkt_str)
+```
+ ## list()
+### Using `query_sb_date`
+`query_sb_date` returns ScienceBase items that fall within a certain time range. Several timestamps are applied to each item, so you need to specify which one to match against the range using `date_type`. By default, the query looks for items last updated between 1970-01-01 and today's date. See `?query_sb_date` for examples of which timestamps are available.
+``` r
+# find data worked on in the last week
+today <- Sys.Date()
+oneweekago <- today - as.difftime(7, units='days')
+recent_data <- query_sb_date(start = oneweekago, end = today)
+sapply(recent_data, function(item) item$title)
+```
+ ## [1] "US Topo"
+ ## [2] "Collection of Field Photographs from Alaska"
+ ## [3] "National Elevation Dataset (NED) Alaska 2 arc-second"
+ ## [4] "Multi-stressor Predictive Models of Invertebrate Condition in the Corn Belt, U.S.A."
+ ## [5] "Topo Map Data"
+ ## [6] "Landscape Capability for Virginia Rail, Version 3.1, Northeast U.S."
+ ## [7] "Connectivity in WA: Products of the Washington Wildlife Habitat Connectivity Working Group"
+ ## [8] "National Transportation Dataset (NTD)"
+ ## [9] "Terrestrial Core and Connector Network, CT River Watershed"
+ ## [10] "USGS US Topo 7.5-minute map for Amesville, OH 2011"
+ ## [11] "USGS US Topo 7.5-minute map for Alfred, OH 2013"
+ ## [12] "USGS US Topo 7.5-minute map for Addison, OH-WV 2013"
+ ## [13] "USGS US Topo 7.5-minute map for Adamsville, OH 2010"
+ ## [14] "USGS US Topo 7.5-minute map for Akron East, OH 2010"
+ ## [15] "USGS US Topo 7.5-minute map for Alvordton, OH-MI 2013"
+ ## [16] "USGS US Topo 7.5-minute map for Albertville, WI 2010"
+ ## [17] "USGS US Topo 7.5-minute map for Bartlettsville, IN 2013"
+ ## [18] "USGS US Topo 7.5-minute map for Elmer, NJ 2011"
+ ## [19] "USGS US Topo 7.5-minute map for Briggsville, WI 2013"
+ ## [20] "USGS US Topo 7.5-minute map for Brillion, WI 2010"
+``` r
+# find data that's been created over the last year
+oneyearago <- today - as.difftime(365, units='days')
+recent_data <- query_sb_date(start = oneyearago, end = today, date_type = "dateCreated")
+sapply(recent_data, function(item) item$title)
+```
+ ## [1] "USGS NED Original Product Resolution CA Sonoma 2013 bh soco 0074 TIFF 2017"
+ ## [2] "USGS NED Original Product Resolution CA Sonoma 2013 bh soco 0051 TIFF 2017"
+ ## [3] "USGS NED Original Product Resolution VA Eastern-ShoreBAA 2015 DEM S23 2679 10 TIFF 2017"
+ ## [4] "USGS NED Original Product Resolution VA Eastern-ShoreBAA 2015 DEM S23 2679 30 TIFF 2017"
+ ## [5] "USGS NED Original Product Resolution VA Eastern-ShoreBAA 2015 DEM S23 2679 40 TIFF 2017"
+ ## [6] "USGS NED Original Product Resolution VA Eastern-ShoreBAA 2015 DEM S23 2686 10 TIFF 2017"
+ ## [7] "USGS NED Original Product Resolution VA Eastern-ShoreBAA 2015 DEM S23 2666 40 TIFF 2017"
+ ## [8] "USGS NED Original Product Resolution VA Eastern-ShoreBAA 2015 DEM S23 2662 30 TIFF 2017"
+ ## [9] "USGS NED Original Product Resolution VA Eastern-ShoreBAA 2015 DEM S23 2660 10 TIFF 2017"
+ ## [10] "USGS NED Original Product Resolution VA Eastern-ShoreBAA 2015 DEM S23 2664 40 TIFF 2017"
+ ## [11] "USGS NED Original Product Resolution VA Eastern-ShoreBAA 2015 DEM S23 2658 30 TIFF 2017"
+ ## [12] "USGS NED Original Product Resolution VA Eastern-ShoreBAA 2015 DEM S23 2669 30 TIFF 2017"
+ ## [13] "USGS NED Original Product Resolution VA Eastern-ShoreBAA 2015 DEM S23 2795 10 TIFF 2017"
+ ## [14] "USGS NED Original Product Resolution VA Eastern-ShoreBAA 2015 DEM S23 2795 30 TIFF 2017"
+ ## [15] "USGS NED Original Product Resolution VA Eastern-ShoreBAA 2015 DEM S23 2795 40 TIFF 2017"
+ ## [16] "USGS NED Original Product Resolution VA Eastern-ShoreBAA 2015 DEM S23 2796 20 TIFF 2017"
+ ## [17] "USGS NED Original Product Resolution VA Eastern-ShoreBAA 2015 DEM S23 2862 40 TIFF 2017"
+ ## [18] "USGS NED Original Product Resolution VA Eastern-ShoreBAA 2015 DEM S23 2791 10 TIFF 2017"
+ ## [19] "USGS NED Original Product Resolution VA Eastern-ShoreBAA 2015 DEM S23 2785 40 TIFF 2017"
+ ## [20] "USGS NED Original Product Resolution VA Eastern-ShoreBAA 2015 DEM S23 2786 30 TIFF 2017"
+### Using `query_sb_datatype`
+`query_sb_datatype` is used to search ScienceBase by the type of data an item is listed as. Run `sb_datatypes()` to get a list of 50 available data types.
+``` r
+# get ScienceBase news items
+sbnews <- query_sb_datatype("News")
+sapply(sbnews, function(item) item$title)
+```
+ ## [1] "Will Climate Change Hurt Tropical Rainforests? Scientists Study the Effects of Warming on Puerto Rican Forest"
+ ## [2] "Fire and Ice: Gauging the Effects of Wildfire on Alaskan Permafrost"
+ ## [3] "Sample Event Item"
+``` r
+# find shapefiles
+shps <- query_sb_datatype("Shapefile")
+sapply(shps, function(item) item$title)
+```
+ ## [1] "Gulf of Alaska Digitization Project"
+ ## [2] "Estimated low-flow statistics at ungaged stream locations in New Jersey, water year 2016"
+ ## [3] "Airborne magnetic and radiometric survey, Ironton, Missouri area"
+ ## [4] "Gravity Change from 2014 to 2015, Sierra Vista Subwatershed, Upper San Pedro Basin, Arizona"
+ ## [5] "Airborne electromagnetic and magnetic survey data, East Poplar Oil Field and surrounding area, October 2014, Fort Peck Indian Reservation, Montana"
+ ## [6] "Magnetotelluric sounding locations, stations 1 to 22, Southern San Luis Valley, Colorado, 2006"
+ ## [7] "Cape Lookout, North Carolina 2012 National Wetlands Inventory Habitat Classification"
+ ## [8] "Inorganic and organic concentration data collected from 38 streams in the United States, 2012-2014, with supporting data, as part of the Chemical Mixtures and Environmental Effects Pilot Study."
+ ## [9] "Principal facts of gravity data in the southern San Luis Basin, northern New Mexico"
+ ## [10] "Airborne Geophysical Surveys over the Eastern Adirondacks, New York State"
+ ## [11] "Mean of the Top Ten Percent of NDVI Values in the Yuma Proving Ground during Monsoon Season, 1986-2011"
+ ## [12] "PaRadonGW.shp - Evaluation of Radon Occurrence in Groundwater from 16 Geologic Units in Pennsylvania, 19862015, with Application to Potential Radon Exposure from Groundwater and Indoor Air"
+ ## [13] "Boat-based water-surface elevation surveys along the upper Willamette River, Oregon, in March, 2015"
+ ## [14] "Sediment Texture and Geomorphology of the Sea Floor from Fenwick Island, Maryland to Fisherman's Island, Virginia"
+ ## [15] "Point locations for earthquakes M2.5 and greater in a two-year aftershock sequence resulting from the HayWired scenario earthquake mainshock (4/18/2018) in the San Francisco Bay area, California"
+ ## [16] "Data for a Comprehensive Survey of Fault Zones, Breccias, and Fractures in and Flanking the Eastern Española Basin, Rio Grande Rift, New Mexico"
+ ## [17] "Bathymetry and Capacity of Shawnee Reservoir, Oklahoma, 2016"
+ ## [18] "Direct-push sediment cores, resistivity profiles, and depth-averaged resistivity collected for Platte River Recovery and Implementation Program in Phelps County, Nebraska"
+ ## [19] "Carbonate geochemistry dataset of the soil and an underlying cave in the Ozark Plateaus, central United States"
+ ## [20] "Streamflow and fish community diversity data for use in developing ecological limit functions for the Cumberland Plateau, northeastern Middle Tennessee and southwestern Kentucky, 2016"
+``` r
+# find raster data
+sbraster <- query_sb_datatype("Raster")
+sapply(sbraster, function(item) item$title)
+```
+ ## [1] "Modified Land Cover Raster for the Upper Oconee Watershed"
+ ## [2] "Digital elevation model of Little Holland Tract, Sacramento-San Joaquin Delta, California, 2015"
+ ## [3] "Seafloor character--Offshore Pigeon Point, California"
+ ## [4] "2010 UMRS Color Infrared Aerial Photo Mosaic - Illinois River, LaGrange Pool South"
+ ## [5] "2010 UMRS Color Infrared Aerial Photo Mosaic - Mississippi River, Pool 06"
+ ## [6] "Vegetation data for 1970-1999, 2035-2064, and 2070-2099 for 59 vegetation types"
+ ## [7] "Ecologically-relevant landforms for Southern Rockies LCC"
+ ## [8] "Multivariate Adaptive Constructed Analogs (MACA) CMIP5 Statistically Downscaled Data for Coterminous USA"
+ ## [9] "Backscatter [USGS07]--Offshore of Gaviota Map Area, California"
+ ## [10] "Sediment Thickness--Pigeon Point to Monterey, California"
+ ## [11] "Seafloor character--Offshore of Point Conception Map Area, California"
+ ## [12] "Trout Unlimited-Coldwater Fisheries Data"
+ ## [13] "Coal Mines"
+ ## [14] "Condition Index - Aquatic - Focal Species"
+ ## [15] "Vhg: terrestrially-defined vulnerability, biome velocity for Great Northern LCC"
+ ## [16] "Vtw: hydrologically-defined vulnerability, temperature change for Great Northern LCC"
+ ## [17] "North American vegetation model data for land-use planning in a changing climate:"
+ ## [18] "Bathymetry hillshade--Offshore of Point Conception Map Area, California"
+ ## [19] "Projected Future LOCA Statistical Downscaling (Localized Constructed Analogs) Statistically downscaled CMIP5 climate projections for North America"
+ ## [20] "Bathymetry Hillshade [2m]--Offshore of Monterey Map Area, California"
+### Best of both methods
+Although you can query from R, sometimes it's useful to look at an item on the web interface. You can use the `query_sb_*` functions and then follow each returned item's URL to view it on the web. This is especially handy for viewing maps and metadata, or for checking or repairing a ScienceBase item if any of the `sbtools`-based commands have failed.
+``` r
+sbmaps <- query_sb_datatype("Static Map Image", limit=3)
+oneitem <- sbmaps[[1]]
+# get and open URL from single sbitem
+url_oneitem <- oneitem[['link']][['url']]
+browseURL(url_oneitem)
+# get and open URLs from many sbitems
+lapply(sbmaps, function(sbitem) {
+  url <- sbitem[['link']][['url']]
+  browseURL(url)
+  return(url)
+})
+ ## [[1]]
+ ## [1] "https://www.sciencebase.gov/catalog/item/4f4e4813e4b07f02db4da961"
+ ##
+ ## [[2]]
+ ## [1] "https://www.sciencebase.gov/catalog/item/4f4e4884e4b07f02db51840f"
+ ##
+ ## [[3]]
+ ## [1] "https://www.sciencebase.gov/catalog/item/57b3b8e7e4b03bcb0103980a"
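If you only need the URLs, or are running a non-interactive script where opening browser tabs is undesirable, the same extraction can be done without calling `browseURL` on every item. The following is a minimal sketch under stated assumptions: `mock_items` is a hand-built, hypothetical stand-in that reproduces only the `link$url` field used above, so the example runs without `sbtools` or a ScienceBase connection.

```r
# Mock items mimicking the part of an sbitem used above (assumption: only the
# `title` and `link$url` fields are reproduced; real sbitems carry many more)
mock_items <- list(
  list(title = "Map A",
       link = list(url = "https://www.sciencebase.gov/catalog/item/4f4e4813e4b07f02db4da961")),
  list(title = "Map B",
       link = list(url = "https://www.sciencebase.gov/catalog/item/4f4e4884e4b07f02db51840f"))
)

# vapply() guarantees a character vector with exactly one URL per item
item_urls <- vapply(mock_items,
                    function(sbitem) sbitem[['link']][['url']],
                    character(1))

# Only open browser tabs when a person is at the console
if (interactive()) {
  invisible(lapply(item_urls, browseURL))
}

item_urls
```

Keeping the URLs in a plain character vector also makes it easy to write them to a file or paste them into a report.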
+### Using `query_sb`
+`query_sb` is the "catch-all" function for querying ScienceBase from R. It only takes one argument for specifying query parameters, `query_list`. This is an R list with specific query parameters as the list names and the user query string as the list values. See the `Description` section of the help file for all options (`?query_sb`).
+``` r
+# search by keyword
+precip_data <- query_sb(query_list = list(q = 'precipitation'))
+length(precip_data) # 20 entries, but there are likely more than 20 results
+ ## [1] 20
+``` r
+sapply(precip_data, function(item) item$title)
+ ## [1] "Change in Precipitation (Projected and Observed) and Change in Standard Precipitation For Emissions Scenarios A2, A1B and B1 for the Gulf of Mexico"
+ ## [2] "Precipitation as Snow (PAS)"
+ ## [3] "Precipitation"
+ ## [4] "Mean Summer (May to Sep) Precipitation (MSP)"
+ ## [5] "Summer (Jun to Aug) Precipitation (PPTSM)"
+ ## [6] "Isoscapes of d18O and d2H reveal climatic forcings on Alaska and Yukon precipitation"
+ ## [7] "Mean Annual Precipitation (MAP)"
+ ## [8] "Precipitation mm/year projections for years 2010-2080 RCP 8.5"
+ ## [9] "Isoscapes of d18O and d2H reveal climatic forcings on Alaska and Yukon precipitation"
+ ## [10] "Winter (Dec to Feb) Precipitation (PPTWT)"
+ ## [11] "Precipitation mm/year projections for years 2010-2080 RCP 4.5"
+ ## [12] "Isoscapes of d18O and d2H reveal climatic forcings on Alaska and Yukon precipitation"
+ ## [13] "Isoscapes of d18O and d2H reveal climatic forcings on Alaska and Yukon precipitation"
+ ## [14] "Isoscapes of d18O and d2H reveal climatic forcings on Alaska and Yukon precipitation"
+ ## [15] "Average, Standard and Projected Precipitation for Emissions Scenarios A2, A1B, and B1 for the Gulf of Mexico"
+ ## [16] "30 Year Mean Annual Precipitation 1960- 1990 PRISM"
+ ## [17] "Precipitation variability and primary productivity in water-limited ecosystems: how plants 'leverage' precipitation to 'finance' growth"
+ ## [18] "Climate change and precipitation - Consequences of more extreme precipitation regimes for terrestrial ecosystems"
+ ## [19] "A Numerical Study of the 1996 Saguenay Flood Cyclone: Effect of Assimilation of Precipitation Data on Quantitative Precipitation Forecasts"
+ ## [20] "A precipitation-runoff model for part of the Ninemile Creek Watershed near Camillus, Onondaga County, New York"
+``` r
+# search by keyword, sort by last updated, and increase num results allowed
+precip_data_recent <- query_sb(query_list = list(q = 'precipitation',
+ sort = 'lastUpdated',
+ limit = 50))
+length(precip_data_recent) # still 20 entries in this run: `limit` placed inside `query_list` did not raise the cap (later examples pass `limit` as a separate argument)
+ ## [1] 20
+``` r
+sapply(precip_data_recent, function(item) item$title)
+ ## [1] "Developing an Agroforestry Dashboard for the Marshall Islands"
+ ## [2] "Measured and estimated monthly precipitation values for precipitation gages in the Black Hills area, South Dakota, water years 1931-98"
+ ## [3] "Polygons Representing Sensitivity of Ground Water to Contamination in Lawrence County, SD"
+ ## [4] "Arcs Representing Potential Streamflow-loss Zones in Lawrence County, SD"
+ ## [5] "Polygons Representing Drainage Areas Upstream from Potential Streamflow-loss Zones in Lawrence County, SD"
+ ## [6] "Saturation overland flow estimated by TOPMODEL for the conterminous United States"
+ ## [7] "GSFLOW model simulations used to evaluate the impact of irrigated agriculture on surface water - groundwater interaction"
+ ## [8] "A model for evaluating stream temperature response to climate change scenarios in Wisconsin"
+ ## [9] "Response of deep groundwater to land use change in desert basins of the Trans-Pecos region, Texas, USA: Effects on infiltration, recharge, and nitrogen fluxes"
+ ## [10] "Microbiological reduction of Sb(V) in anoxic freshwater sediments"
+ ## [11] "Region-wide ecological responses of arid Wyoming big sagebrush communities to fuel treatments"
+ ## [12] "Soil resources influence vegetation and response to fire and fire-surrogate treatments in sagebrush-steppe ecosystems"
+ ## [13] "Depletion and capture: revisiting The source of water derived from wells\""
+ ## [14] "Different historical fireclimate patterns in California"
+ ## [15] "Evaluation of downscaled General Circulation Model (GCM) output for current conditions and associated error in simulated runoff for CONUS"
+ ## [16] "Projected Hydrologic Changes Under Mid-21st Century Climatic Conditions in a Sub-arctic Watershed"
+ ## [17] "Basic principles of wind erosion control"
+ ## [18] "National Water Census Data Resources Portal"
+ ## [19] "Basin Characteristics of Streamgages in New Mexico and Adjacent States (2017)"
+ ## [20] "Are modern geothermal waters in northwest Nevada forming epithermal gold deposits?"
+``` r
+# search by keyword + type
+# Used sb_datatype() to figure out what types were allowed for "browseType"
+precip_maps_data <- query_sb(query_list = list(q = 'precipitation', browseType = "Static Map Image", sort='title'))
+sapply(precip_maps_data, function(item) item$title)
+ ## [1] "The Washington-British Columbia Transboundary Climate-Connectivity Project"
+If you want to search by more than one keyword, you should use Lucene query syntax. Visit [this page](http://www.lucenetutorial.com/lucene-query-syntax.html) for information on Lucene queries. For instance, you can have results returned that include both "flood" and "earthquake", or either "flood" or "earthquake". Current functionality requires a regular query to be specified in order for `lq` to return results. So, just include `q = ''` when executing Lucene queries (this is a known [issue](https://github.com/USGS-R/sbtools/issues/236) in `sbtools`).
+``` r
+# search by 2 keywords (AND)
+hazard2and_data <- query_sb(query_list = list(q = '', lq = 'flood AND earthquake'),
+                            limit=200)
+length(hazard2and_data)
+ ## [1] 62
+``` r
+# search by 2 keywords (OR)
+hazard2or_data <- query_sb(query_list = list(q = '', lq = 'flood OR earthquake'),
+                           limit=200)
+length(hazard2or_data) # hits the limit of 200, so there may be more results
+ ## [1] 200
+``` r
+# search by 3 keywords with grouped query
+hazard3_data <- query_sb(query_list =
+                           list(q = '', lq = '(flood OR earthquake) AND drought'),
+                         limit=200)
+length(hazard3_data)
+ ## [1] 158
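When the keyword list changes often, it can be convenient to assemble the Lucene string programmatically instead of typing it by hand. The sketch below is an illustration only: `build_lq` is a hypothetical helper written in base R, not a function from `sbtools`, and it handles only simple `AND`/`OR` joins.

```r
# Hypothetical helper (not part of sbtools): join keywords with a Lucene operator
build_lq <- function(keywords, operator = c("AND", "OR")) {
  operator <- match.arg(operator)
  paste(keywords, collapse = paste0(" ", operator, " "))
}

# Group the OR clause in parentheses, then combine it with a third keyword
either_hazard <- paste0("(", build_lq(c("flood", "earthquake"), "OR"), ")")
lq_string <- build_lq(c(either_hazard, "drought"), "AND")
lq_string
# "(flood OR earthquake) AND drought"

# The string can then be passed on as in the examples above (this needs sbtools
# and a network connection, so it is left commented out here):
# hazard3_data <- query_sb(query_list = list(q = '', lq = lq_string), limit = 200)
```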
+### No results
+Some of your queries will probably return no results. When there are no results that match your query, the returned list will have a length of 0.
+``` r
+# search for items related to a Water Quality Portal paper DOI
+wqp_paper <- query_sb_doi(doi = '10.1002/2016WR019993')
+length(wqp_paper)
+ ## [1] 0
+``` r
+head(wqp_paper)
+ ## list()
+``` r
+# spatial query over open ocean (28.79 E, 41.44 S, south of Africa)
+atlantic_ocean <- query_sb_spatial(long=28.790431, lat=-41.436485)
+length(atlantic_ocean)
+ ## [1] 0
+``` r
+head(atlantic_ocean)
+ ## list()
+``` r
+# date query during Marco Polo's life
+marco_polo <- query_sb_date(start = as.Date("1254-09-15"),
+                            end = as.Date("1324-01-08"))
+length(marco_polo)
+ ## [1] 0
+``` r
+head(marco_polo)
+ ## list()
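Because an empty result is just a zero-length list, it is worth checking for it before post-processing. Note that `sapply` on an empty list returns `list()` rather than a character vector, which can surprise downstream code. The sketch below is one possible guard, assuming a hypothetical helper `get_titles` (not part of `sbtools`) and mock result lists so that it runs without a ScienceBase connection.

```r
# Hypothetical helper (not part of sbtools): pull titles from a query result,
# returning an empty character vector when the query matched nothing
get_titles <- function(items) {
  if (length(items) == 0) {
    message("Query returned no items")
    return(character(0))
  }
  vapply(items, function(item) item$title, character(1))
}

# Mock results standing in for the output of a query_sb_* call
no_results <- list()
some_results <- list(list(title = "Precipitation"), list(title = "Coal Mines"))

get_titles(no_results)    # character(0), plus a message
get_titles(some_results)  # "Precipitation" "Coal Mines"
```

Using `vapply` with `character(1)` keeps the return type the same whether or not the query matched anything.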
