Commit

Modify to represent NCDB
Edwards, Paul committed Sep 25, 2024
1 parent a213a4f commit 93ab4ef
Showing 1 changed file with 2 additions and 2 deletions.
4 changes: 2 additions & 2 deletions source/docs/solar/ncdb/guide.html.md.erb
@@ -6,7 +6,7 @@ summary: Tips and tricks for getting the most out of the API

## Data Download Usage Guide

-We often receive requests from users interested in downloading larger segments of the data than are supported by the API. It is important to understand that these datasets are large. Currently our storage archive contains hundreds of terabytes of data and is constantly growing! In order to reliably serve dynamic segments of a dataset of this size to a growing community of users we have calculated the maximum capacity our server hardware can sustain. We use this physical capacity to determine our API rate limits. For users insterested in accessing bulk data the full datasets are available via the Registry of Open Data on AWS at [https://registry.opendata.aws/nrel-pds-nsrdb/](https://registry.opendata.aws/nrel-pds-nsrdb/).
+We often receive requests from users interested in downloading larger segments of the data than are supported by the API. It is important to understand that these datasets are large. Currently our storage archive contains hundreds of terabytes of data and is constantly growing! In order to reliably serve dynamic segments of a dataset of this size to a growing community of users we have calculated the maximum capacity our server hardware can sustain. We use this physical capacity to determine our API rate limits. For users insterested in accessing bulk data the full datasets are available via the Registry of Open Data on AWS at [https://registry.opendata.aws/nrel-pds-ncdb/](https://registry.opendata.aws/nrel-pds-ncdb/).

The API is restricted in several ways including the number of simultaneous requests a single user can make, the number of requests a single user can make in a 24 hour period, as well as in the maximum size of a single request. The API rate limits are set at:

@@ -31,7 +31,7 @@ The size limit per each single request is determined by the number of total attr

To maximize a single request simply minimize the variables wherever possible. For example by requesting half as many attributes one can request twice as many years (or sites) worth of data.
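
The trade-off described above can be illustrated with a small calculation. This is only a sketch: it assumes the request size scales with the product of attributes, years, intervals per year, and sites, which is what the "half as many attributes, twice as many years" statement implies. The `budget` value is hypothetical and is not the API's actual limit; `max_sites` is an illustrative helper, not part of the API.

```python
def max_sites(budget: int, attributes: int, years: int, intervals_per_year: int) -> int:
    """Largest site count that keeps attributes * years * intervals * sites within budget.

    Assumes request size is the plain product of these four quantities.
    """
    per_site = attributes * years * intervals_per_year
    return budget // per_site


# Hypothetical budget chosen only so the numbers divide evenly.
HYPOTHETICAL_BUDGET = 8_760_000

full = max_sites(HYPOTHETICAL_BUDGET, attributes=10, years=2, intervals_per_year=8760)
halved = max_sites(HYPOTHETICAL_BUDGET, attributes=5, years=2, intervals_per_year=8760)
print(full, halved)  # halving the attributes doubles the sites that fit
```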

-Imagine a use case where one wanted to download all PSM data for the state of Texas. The first step would be to refine the request down to the least number of attributes, years, and intervals possible. From that one could determine the number of sites that could be requested in a single request. By experimenting with the site_count endpoint a polygon size could be identified that intersects close to that many sites. Using that polygon size create a grid that covers all of Texas. The final step would be to write a script that invokes the API once for every required grid cell at a rate of no more than 1 every 2 seconds and no more than 20 in process at a time. If more than 1000 requests are required the script will have to be continued across multiple days.
+Imagine a use case where one wanted to download all Climate data for the state of Texas. The first step would be to refine the request down to the least number of attributes, years, and intervals possible. From that one could determine the number of sites that could be requested in a single request. By experimenting with the site_count endpoint a polygon size could be identified that intersects close to that many sites. Using that polygon size create a grid that covers all of Texas. The final step would be to write a script that invokes the API once for every required grid cell at a rate of no more than 1 every 2 seconds and no more than 20 in process at a time. If more than 1000 requests are required the script will have to be continued across multiple days.
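
The grid-and-throttle workflow described above might be sketched as follows. This is a minimal illustration under stated assumptions, not the project's own script: `grid_cells` and `run_throttled` are hypothetical helpers, the bounding box is a rough rectangle around Texas rather than the state boundary, and `submit` stands in for whatever function actually calls the API. The sketch enforces the 2-second spacing and a daily cap, but it does not track the 20-in-process limit, since that depends on status information from the API that is not described here.

```python
import time
from typing import Callable, Iterable, Iterator, List, Tuple

BBox = Tuple[float, float, float, float]  # (min_lon, min_lat, max_lon, max_lat)


def grid_cells(bbox: BBox, cell_deg: float) -> Iterator[str]:
    """Yield WKT POLYGON strings that tile bbox with cells of cell_deg degrees."""
    min_lon, min_lat, max_lon, max_lat = bbox
    lat = min_lat
    while lat < max_lat:
        top = min(lat + cell_deg, max_lat)
        lon = min_lon
        while lon < max_lon:
            right = min(lon + cell_deg, max_lon)
            # WKT rings list lon/lat pairs and repeat the first point to close.
            yield (f"POLYGON(({lon} {lat},{right} {lat},{right} {top},"
                   f"{lon} {top},{lon} {lat}))")
            lon = right
        lat = top


def run_throttled(cells: Iterable[str],
                  submit: Callable[[str], object],
                  min_interval: float = 2.0,
                  daily_cap: int = 1000,
                  sleep: Callable[[float], None] = time.sleep,
                  clock: Callable[[], float] = time.monotonic) -> List[str]:
    """Call submit(wkt) at most once per min_interval seconds.

    Stops after daily_cap calls and returns the cells that were not sent,
    so they can be resumed on a later day.
    """
    pending = list(cells)
    last_sent = None
    for i, wkt in enumerate(pending):
        if i >= daily_cap:
            return pending[i:]
        if last_sent is not None:
            wait = min_interval - (clock() - last_sent)
            if wait > 0:
                sleep(wait)
        last_sent = clock()
        submit(wkt)
    return []


# Rough rectangle around Texas (an assumption for illustration only).
TEXAS_BBOX: BBox = (-107.0, 25.0, -93.0, 37.0)
```

A run would pass the generated cells and a real request function, for example `leftover = run_throttled(grid_cells(TEXAS_BBOX, 2.0), submit=my_request_fn)`, where `my_request_fn` is a placeholder; any cells in `leftover` would be carried over to the next day. The cell size would come from experimenting with the site_count endpoint as described above.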


In cases where a very large WKT value is required, e.g. downloading the maximum number of broadly spaced points at a time using a MULTIPOINT wkt, it is possible to POST a request to the API providing the WKT params in the payload. Here is an example of a Python script that uses this variation:
