---
title: "WQP Web Services Guide"
author: "USGS"
date: "9/1/2021"
output: html_document
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
* [Introduction](#introduction)
* [What Are Web Services?](#what-are-web-services)
* [Accessing Data Programmatically Through Web Services](#accessing-data-programmatically-through-web-services)
* [How to Generate Web Services Requests](#how-to-generate-web-services-requests)
* [Using Open Geospatial Consortium (OGC) Services to map Water Quality Portal sites based on Water Quality Portal Parameters](#using-open-geospatial-consortium-services-to-map-water-quality-portal-sites-based-on-water-quality-portal-parameters)
## Introduction
The Water Quality Portal (WQP) provides easy access to data stored in three large water quality databases (WQX, NWIS, STEWARDS) through a web-based form interface as well as standalone web services. Both the form interface and the web services use the same input parameters (filters) and produce the same output formats. The web services enable programmatic access to WQP data and metadata without manual interaction with the form interface.
You can use the WQP web services to quickly and easily access data and metadata available on the Water Quality Portal. URL queries are constructed in a standard format, and outputs are delivered in the requested format: WQX-XML by default, with CSV, TSV, XLSX, GeoJSON, KML, and KMZ also available.
For more information on the WQP input parameters (filters) and data downloads, see the WQP [User Guide](http://www.waterqualitydata.us/portal_userguide.jsp).
## What Are Web Services?
APIs (Application Programming Interfaces) and web services are tools that enable communication between two networked devices or pieces of software using standardized methods. They are used in almost all of the mobile applications we rely on every day. They allow data from one system to be used by a second system without requiring the second system to store the data locally, which is especially beneficial when dealing with *big* data sets. In fact, common software packages, including Excel and R, increasingly come with built-in support for these services.
While APIs and web services provide similar functions, the types of communication they allow differ:
* An API allows two applications to communicate by creating shared rules and conventions. These can be used with a network connection, but there are also APIs that do not involve a network connection.
* A web service is a *type* of API that allows one computer to communicate data to another computer in a standardized way. While all web services are APIs, not all APIs are web services.
The WQP web services are implemented over HTTPS using REST, which provides a flexible and scalable approach for constructing standardized URL statements.
Click [here](https://github.com/project-open-data/project-open-data.github.io/blob/master/api-basics.md) to learn more about the basics of APIs, or check out this [Github page](https://github.com/18F/API-All-the-X/tree/master/pages) for details about how APIs are implemented within the USGS and across the federal government.
### Accessing Data Programmatically Through Web Services
Web services provide a public resource which users can also enhance with their own custom scripts.
One resource created by the USGS is an R-based software package called *dataRetrieval*, which makes it easier for R users to use the WQP web services. *dataRetrieval* does the heavy lifting to download data and convert it into a familiar and usable format. *dataRetrieval* can download data from the WQP and from a number of USGS NWIS services. To learn more about *dataRetrieval*, please check out these resources:
* [CRAN](https://cran.r-project.org/web/packages/dataRetrieval/index.html) - Download current release
* [GitHub](https://github.com/USGS-R/dataRetrieval) - Download most up-to-date source code (may contain bugs)
* [Tutorial](https://waterdata.usgs.gov/blog/dataretrieval/) - Learn how to use *dataRetrieval*
## How to Generate Web Services Requests
You can retrieve the *same* data using web services as you can with the WQP user interface (advanced form). Query basics are outlined in the following tables. Most non-alphanumeric characters (such as punctuation) must be ["**url-encoded**"](https://www.tutorialspoint.com/html/html_url_encoding.htm) (*for example*, a space is "%20").
Constructing a Request:
Every request starts with a base URL, which varies depending on the type of information requested:
* Base URL for downloading site data and associated metadata: `https://www.waterqualitydata.us/data/Station/search?`
* Base URL for downloading results: `https://www.waterqualitydata.us/data/Result/search?`
* Base URL for downloading activity data: `https://www.waterqualitydata.us/data/Activity/search?`
* Base URL for downloading activity metric data: `https://www.waterqualitydata.us/data/ActivityMetric/search?`
Construct a query by concatenating the base URL with the desired parameters and arguments, as shown in ***Table 1***. At least one parameter-argument pair must be specified. Separate multiple parameter-argument pairs with an *ampersand* ("&"). For downloads, if no file format (*mime type*) is specified, the retrieval defaults to **WQX-XML format**. See the [WQP User Guide](http://www.waterqualitydata.us/portal_userguide.jsp) for a list of elements included in the result retrievals.
***Table 1.* URL-encoded retrieval parameters and arguments for WQP web services (parameter names are case-sensitive, except that the leading letter may be upper or lower case)**
<details>
<summary>Expand Table</summary>
| REST parameter | Argument | Discussion |
|:-------------------:|:----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------:|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------:|
| bBox | *Western-most longitude, southern-most latitude, eastern-most longitude, and northern-most latitude, separated by commas and expressed in decimal degrees (WGS84). Longitudes west of Greenwich are negative. (Example: bBox=-92.8,44.2,-88.9,46.0)* | These four arguments are used together to form a bounding box on the Earth's surface for locating data-collection stations. Many stations outside the continental US do not have latitude and longitude referenced to WGS84 and therefore cannot be found using these parameters. Other stations are not associated with latitude and longitude due to Homeland Security concerns. | |
| lat | *Latitude for radial search, expressed in decimal degrees, WGS84* | These three arguments are used together to form a circle on the Earth's surface for locating data-collection stations. Many stations outside the continental US do not have latitude and longitude referenced to WGS84 and therefore cannot be found using these parameters. | |
| long | *Longitude for radial search, expressed in decimal degrees, WGS84* | | |
| within | *Distance for radial search, expressed in decimal miles* | | |
| countrycode | *Two-character Federal Information Processing Standard (FIPS) country code ([allowable values](https://www.waterqualitydata.us/portal_userguide/#WQPUserGuide-Domain_Value)).* | FIPS country codes were established by the [National Institute of Standards, publication 10-4](https://nvlpubs.nist.gov/nistpubs/Legacy/FIPS/fipspub10-4.pdf). | |
| statecode | *Two-character Federal Information Processing Standard (FIPS) country code, followed by a URL-encoded colon ("%3A"), followed by a two-digit FIPS state code. ([allowable values](https://www.waterqualitydata.us/portal_userguide/#WQPUserGuide-Domain_Value)).* | FIPS state codes were established by the [National Institute of Standards, publication 5-2](https://nvlpubs.nist.gov/nistpubs/Legacy/FIPS/fipspub5-2.pdf). | |
| countycode | *Two-character Federal Information Processing Standard (FIPS) country code, followed by a URL-encoded colon ("%3A"), followed by a two-digit FIPS state code, followed by a URL-encoded colon ("%3A"), followed by a three-digit FIPS county code. ([allowable values](https://www.waterqualitydata.us/portal_userguide/#WQPUserGuide-Domain_Value)).* | FIPS county codes were established by the [National Institute of Standards, publication 6-4](https://nvlpubs.nist.gov/nistpubs/Legacy/FIPS/fipspub6-4.pdf). | |
| siteType | *One or more case-sensitive site types, separated by semicolons ([allowable values](https://www.waterqualitydata.us/portal_userguide/#WQPUserGuide-Domain_Value)).* | Restrict retrieval to stations with specified site type (location in the hydrologic cycle). The MonitoringLocationTypeName for individual records may provide more detailed information about the type of individual stations. | |
| organization | *For USGS organization IDs, append an upper-case postal-service state abbreviation to "USGS-" to identify the USGS office managing the data collection station records. However, a few US states are serviced by one USGS office ([allowable values](https://www.waterqualitydata.us/portal_userguide/#WQPUserGuide-Domain_Value)).* (**USGS-MA** = Massachusetts and Rhode Island, **USGS-MD** = Maryland, Delaware, and the District of Columbia, **USGS-PR** = Caribbean Islands, **USGS-HI** = Pacific Islands). | USGS offices sometimes provide data for stations outside the political boundaries associated with the office's organization code. Use the statecode or countycode arguments to search for stations located within those political boundaries. | |
| siteid | *Concatenate an agency code, a hyphen ("-"), and a site-identification number.* | Each data collection station is assigned a unique site-identification number. Other agencies often use different site identification numbers for the same stations. | |
| huc | *One or more eight-digit hydrologic units, delimited by semicolons.* | Hydrologic unit codes identify surface watersheds. The [lists and maps of hydrologic units](http://water.usgs.gov/GIS/huc.html) are available from USGS. | |
| sampleMedia | *One or more case-sensitive sample media, separated by semicolons ([allowable values](https://www.waterqualitydata.us/portal_userguide/#WQPUserGuide-Domain_Value)).* | Sample media are broad general classes, and may be subdivided in the retrieved data. Examine the data elements ActivityMediaName, ActivityMediaSubdivisionName, and ResultSampleFractionText for more detailed information. | |
| characteristicType | *One or more case-sensitive characteristic types (groupings) separated by semicolons ([allowable values](https://www.waterqualitydata.us/portal_userguide/#WQPUserGuide-Domain_Value)).* | These groups will be expanded as part of the ongoing collaboration between USGS and USEPA. | |
| characteristicName | *One or more case-sensitive characteristic names, separated by semicolons ([allowable values](https://www.waterqualitydata.us/portal_userguide/#WQPUserGuide-Domain_Value)).* | Characteristic names identify different types of environmental measurements. The names are derived from the USEPA [Substance Registry System](http://iaspub.epa.gov/sor_internet/registry/substreg/home/overview/home.do) (SRS). USGS uses parameter codes for the same purpose and has [associated most parameters](http://www.waterqualitydata.us/public_srsnames.jsp) to SRS names. | |
| pCode | *One or more five-digit USGS parameter codes, separated by semicolons.* | | |
| activityId | *One or more case-sensitive activity IDs, separated by semicolons.* | Designator that uniquely identifies an activity within an organization. | |
| startDateLo | *Date of earliest desired data-collection activity, expressed as MM-DD-YYYY* | These two parameters, used together or individually, restrict the retrieval to data-collection activities within a range of dates. | |
| startDateHi | *Date of last desired data-collection activity, expressed as MM-DD-YYYY* | | |
| mimeType | *xml* | Output format is XML compatible with WQX-Outbound schema. This is the default format, and if a mimeType is not specified, the data will be in XML format. | |
| | *xlsx* | Output format is xlsx compatible with MS-Excel 2007 and greater. | |
| | *csv* | Output format is comma-separated columns. | |
| | *tsv\|tab* | Output format is tab-separated columns. | |
| | *geojson* | Output format is GeoJSON (JavaScript Object Notation). | |
| | *kml* | Output format is KML compatible with Google Earth. This option is not available for the results service. | |
| | *kmz* | Output format is kmz, a compressed form of kml compatible with Google Earth. This option is not available for the results service. | |
| Zip | *yes* | Include the parameter to stream compressed data. Compression often greatly increases throughput, thus expediting the request. Kml files will be returned in the kml-specific zip format, .kmz. | |
| providers | *EPA\|NWIS\|STEWARDS ([allowable values](https://www.waterqualitydata.us/portal_userguide/#WQPUserGuide-Domain_Value)).* | By default, requests are submitted to all the data providers. However, a particular provider may be specified using this parameter. | |
| sorted | *yes\|no* | By default, tabular data are sorted by organization, monitoringLocationID, and (for results) activityID. However, sorting increases response time significantly, sometimes by orders of magnitude. If you are doing your own sorting after download, set sorted=no. For large downloads (over 5 million rows) sorting is disabled by default to ensure reasonable response times. XML requests are always sorted to accommodate the WQX data schema. | |
| dataProfile | *biological* | Only affects results endpoint at this time. The biological dataProfile returns an extended set of columns that further describe biological data. | |
</details>
### Example Web Service Requests
To try out the following examples of REST web service requests, copy and paste into a web browser.
>*Example:* REST web service request to retrieve sites from *Oklahoma County, Oklahoma*, where *Atrazine* was measured in *XML* format, *zipped*:
`https://www.waterqualitydata.us/data/Station/search?countycode=US%3A40%3A109&characteristicName=Atrazine&mimeType=xml&zip=yes`
>*Example:* REST web service request to retrieve sites contained within a *bounding box* where *Caffeine* was measured in *KML* format, *zipped*:
`https://www.waterqualitydata.us/data/Station/search?characteristicName=Caffeine&mimeType=kml&bBox=-92.8,44.2,-88.9,46.0&zip=yes`
>*Example:* REST web service request to retrieve stream sites located in the state of Wyoming (USA) where parameters of the characteristicType *Nutrient* were measured, in *GeoJSON* format:
`https://www.waterqualitydata.us/data/Station/search?countrycode=US&statecode=US%3A56&siteType=Stream&characteristicType=Nutrient&mimeType=geojson`
>*Example:* REST web service request to retrieve *Caffeine* sample results from sites contained within a *bounding box* and *collected on, or after, 10-01-2006* in *MS-Excel* format, *zipped*:
`https://www.waterqualitydata.us/data/Result/search?characteristicName=Caffeine&bBox=-92.8,44.2,-88.9,46.0&startDateLo=10-01-2006&mimeType=xlsx&zip=yes`
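The example requests above can also be assembled in code rather than by hand. The following Python sketch shows one way to do that with only the standard library. The helper name `build_wqp_url` and the `BASE_URL` constant are illustrative and are not part of the WQP services. The sketch assumes each argument is passed as a single string, with multiple values already joined by semicolons.

```python
from urllib.parse import quote, urlencode

# Base path shared by the Station, Result, Activity, and ActivityMetric endpoints.
BASE_URL = "https://www.waterqualitydata.us/data"


def build_wqp_url(endpoint, params):
    """Return a WQP request URL for `endpoint` (hypothetical helper)."""
    if not params:
        raise ValueError("at least one parameter-argument pair is required")
    # quote_via=quote percent-encodes spaces as %20 and colons as %3A.
    query = urlencode(params, quote_via=quote)
    return f"{BASE_URL}/{endpoint}/search?{query}"


# Rebuild the Oklahoma County request from its unencoded arguments.
url = build_wqp_url(
    "Station",
    {
        "countycode": "US:40:109",
        "characteristicName": "Atrazine",
        "mimeType": "xml",
        "zip": "yes",
    },
)
print(url)
```

With these inputs the helper reproduces the Oklahoma County request shown above. Note that commas in a `bBox` argument would come out percent-encoded as `%2C` with this approach, whereas the examples above leave them unencoded.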
### Queries Using POST
When a very long request is submitted (for example, when querying data for a large list of sites), a GET request can fail because of limits on the allowable length of the request. Such requests can instead be completed using HTTP POST. POST requests should be submitted with a JSON-formatted payload.
Here's an example *JSON* object, including an organization ID and three site IDs:
```
{
"organization": ["WIDNR_WQX"],
"siteid": ["WIDNR_WQX-133003", "WIDNR_WQX-133398", "WIDNR_WQX-133486"]
}
```
An example using *curl* with these data looks like this:
```
# Send the JSON payload with POST; the payload is closed with its own quote before each line continuation.
curl -X POST --header 'Content-Type: application/json' --header 'Accept: application/zip' \
  -d '{"organization":["WIDNR_WQX"],"siteid":["WIDNR_WQX-133003","WIDNR_WQX-133398","WIDNR_WQX-133486"]}' \
  'https://www.waterqualitydata.us/data/Station/search?mimeType=csv&zip=yes'
```
Each `--header` argument supplies request metadata: `Content-Type` tells the service that the payload is JSON, and `Accept` states the response type the client expects. If there are any problems with the request, they are returned in the response via the warning header(s). See [RFC 2616 Section 14](http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.46) for the format of the warning header.
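For readers who prefer a script to *curl*, the same POST request could be prepared in Python along the following lines. This is a minimal sketch using only the standard library. The helper names `build_post_request` and `download` and the output file name are illustrative assumptions, and the call that actually contacts the service is left commented out.

```python
import json
import urllib.request

URL = "https://www.waterqualitydata.us/data/Station/search?mimeType=csv&zip=yes"

# Same payload as the JSON object shown above.
payload = {
    "organization": ["WIDNR_WQX"],
    "siteid": ["WIDNR_WQX-133003", "WIDNR_WQX-133398", "WIDNR_WQX-133486"],
}


def build_post_request(url, payload):
    """Build (but do not send) a JSON POST request (hypothetical helper)."""
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Accept": "application/zip",
        },
        method="POST",
    )


def download(request, path):
    """Send the request and write the response body to `path`."""
    with urllib.request.urlopen(request) as response, open(path, "wb") as out:
        out.write(response.read())


request = build_post_request(URL, payload)
# download(request, "stations.zip")  # uncomment to contact the service
```

Keeping request construction separate from sending makes it possible to inspect the method, headers, and payload before any network traffic occurs.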
### Using Open Geospatial Consortium Services to map Water Quality Portal sites based on Water Quality Portal Parameters
The Water Quality Portal has an endpoint for generating OGC-compliant geospatial web services. At this time, [Web Mapping Service (WMS)](http://docs.geoserver.org/stable/en/user/services/wms/reference.html) version 1.1.1 and [Web Feature Service (WFS)](http://docs.geoserver.org/stable/en/user/services/wfs/reference.html) version 1.1.1 are supported. The service is a customized version of the WMS and WFS services provided by [GeoServer](http://www.geoserver.org/). To get started, check out the GeoServer WMS and WFS references linked above. The base URL for both the WMS and WFS services is:
`https://www.waterqualitydata.us/ogcservices/{wms|wfs}`
Use web mapping tools, such as *Leaflet* or *OpenLayers*, to easily connect to these web map services. GIS tools such as *ArcGIS* and *QGIS* can also connect to these web map services.
WFS and WMS services require an additional parameter, called "SearchParams", which is a URL-encoded version of the WQP search parameters. The service supports calls that return up to 250,000 sites. Calls that return large numbers of sites will take longer to load the first time, while a cache is populated; subsequent responses will be much quicker. The cache is cleared after new data are loaded to the portal (once a day, at night).
At this time, we support WMS GetMap, WMS GetFeatureInfo, and WMS GetLegendGraphic.
**WMS GetMap**
* For detailed information on constructing a WMS GetMap request, refer to the [GetMap documentation at the GeoServer site](http://docs.geoserver.org/stable/en/user/services/wms/reference.html#getmap).
* Here is an example GetMap request for sites in the United States that have samples for *atrazine*:
`https://www.waterqualitydata.us/ogcservices/wms?SERVICE=WMS&REQUEST=GetMap&VERSION=1.1.1&LAYERS=wqp_sites&STYLES=wqp_sources&FORMAT=image%2Fpng&TRANSPARENT=true&HEIGHT=256&WIDTH=256&SEARCHPARAMS=countrycode%3AUS%3BcharacteristicName%3AAtrazine&SRS=EPSG%3A3857&BBOX=-15028131.257091932,-7.081154551613622e-10,-10018754.171394622,5009377.085697313`
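The `SEARCHPARAMS` value in the example GetMap request above decodes to `countrycode:US;characteristicName:Atrazine`. The Python sketch below builds such a value from a dictionary of filters. The helper name `encode_search_params` is illustrative, and the `name:value;name:value` layout is inferred from that single example, so treat it as an assumption rather than a specification. Filters with multiple values are not handled here.

```python
from urllib.parse import quote


def encode_search_params(params):
    """Encode WQP filters as a SEARCHPARAMS value (hypothetical helper).

    Assumes the name:value;name:value layout seen in the example request.
    """
    joined = ";".join(f"{name}:{value}" for name, value in params.items())
    # safe="" forces ":" and ";" to be percent-encoded as %3A and %3B.
    return quote(joined, safe="")


value = encode_search_params(
    {"countrycode": "US", "characteristicName": "Atrazine"}
)
print(value)
```

For these two filters the result matches the `SEARCHPARAMS` argument in the example request.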
### Looking Up Domain Values Through Web Services
The allowable values for each query variable (referred to as *domain values*) may change over time as new values are added. There is a web service request that allows you to look up which values are allowed for a query parameter.
A list of parameter names and arguments to use for these web service requests is shown in ***Table 2***.
Base URL for looking up domain values: `https://www.waterqualitydata.us/Codes/{endpointName}?{parameter}`
You must provide at least one argument in the web service call. If you want all domain values, you can specify only the *mimeType* (e.g. mimeType=json).
***Table 2.* Domain values web service parameters and arguments**
<details>
<summary>Expand Table</summary>
| {endpointName} | REST parameter | Argument | Discussion | Example |
|:-----------------------------------------------------------------------------:|:--------------:|:---------------------------------------------:|:------------------------------------------------------------------------------------------------------------------------------------------------------:|:---------------------------------------------------------------------------------------------------:|
| Common parameters for all domain values web services | mimeType | xml\|json | returns either XML or json. Default is xml | [https://waterqualitydata.us/Codes/characteristicname?text=ph&pagesize=20&pagenumber=1&mimeType=json](https://waterqualitydata.us/Codes/characteristicname?text=ph&pagesize=20&pagenumber=1&mimeType=json) | |
| | pagenumber | page number (1,2 etc) | allows for results to be paginated (especially useful for endpoints with many valid responses, allows for infinite scrolling). Use along with pagesize | | |
| | pagesize | e.g. 20 | number of results to return per page | | |
| | text | e.g. ph | text to match to endpoint results. This is straight string matching. When the text parameter is used, the results are returned sorted by length | | |
| **Endpoints with unique query parameters in addition to common query parameters** | | | | | |
| countrycode | | | FIPS country codes | | |
| statecode | countrycode | A FIPS country code (e.g. US) | FIPS state codes. A FIPS country code argument is appended so that the URL ends as /statecode?countrycode=US | [https://www.waterqualitydata.us/Codes/statecode?countrycode=US](https://www.waterqualitydata.us/Codes/statecode?countrycode=US) | |
| countycode | statecode | A FIPS statecode (e.g. statecode=US:01;US:04) | FIPS county codes. A FIPS statecode argument is appended so that the URL ends as /countycode?statecode=US:01;US:04 | [https://www.waterqualitydata.us/Codes/countycode?statecode=US:01;US:04](https://www.waterqualitydata.us/Codes/countycode?statecode=US:01;US:04) | |
| Sitetype | | | Available site types | [https://www.waterqualitydata.us/Codes/Sitetype?mimeType=json](https://www.waterqualitydata.us/Codes/Sitetype?mimeType=json) | |
| Organization | | | Available organization IDs | [https://www.waterqualitydata.us/Codes/Organization?mimeType=xml](https://www.waterqualitydata.us/Codes/Organization?mimeType=xml) | |
| Samplemedia | | | Sample media | [https://www.waterqualitydata.us/Codes/Samplemedia?mimeType=xml](https://www.waterqualitydata.us/Codes/Samplemedia?mimeType=xml) | |
| Characteristictype | | | Characteristic types (groups) | [https://www.waterqualitydata.us/Codes/Characteristictype?mimeType=xml](https://www.waterqualitydata.us/Codes/Characteristictype?mimeType=xml) | |
| Characteristicname | | | Characteristic names. A good choice for using paginated results so that hundreds of results are not returned | [https://www.waterqualitydata.us/Codes/Characteristicname?mimeType=xml](https://www.waterqualitydata.us/Codes/Characteristicname?mimeType=xml) | |
| providers | | | The names of the Data Sources for the Water Quality Portal | [https://www.waterqualitydata.us/Codes/providers?mimeType=xml](https://www.waterqualitydata.us/Codes/providers?mimeType=xml) | |
</details>
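Domain value lookups follow the same pattern as data requests, so their URLs can be generated the same way. The Python sketch below builds lookup URLs from the endpoint names and parameters in Table 2. The helper name `build_codes_url` and the `CODES_URL` constant are illustrative and are not part of the WQP services.

```python
from urllib.parse import quote, urlencode

# Base URL for domain value lookups, as given above.
CODES_URL = "https://www.waterqualitydata.us/Codes"


def build_codes_url(endpoint_name, params):
    """Return a domain-values lookup URL (hypothetical helper)."""
    if not params:
        raise ValueError("at least one argument is required, e.g. mimeType")
    # Values are percent-encoded, so "US:01" would become "US%3A01".
    return f"{CODES_URL}/{endpoint_name}?{urlencode(params, quote_via=quote)}"


# First page of 20 characteristic names matching the text "ph", as JSON.
first_page = build_codes_url(
    "characteristicname",
    {"text": "ph", "pagesize": 20, "pagenumber": 1, "mimeType": "json"},
)
# FIPS state codes for the United States.
states = build_codes_url("statecode", {"countrycode": "US"})
print(first_page)
print(states)
```

Incrementing `pagenumber` while holding `pagesize` fixed is one way to walk through a long list, such as the characteristic names, one page at a time.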