Loading remote grid data

Joaquin Bedia edited this page Nov 20, 2017 · 5 revisions

Datasets served through an OPeNDAP service (a remote data access protocol) can be accessed and loaded remotely. Here we illustrate this with the NCEP/NCAR Reanalysis 1 data, which is available through the NOAA THREDDS data server.

Loading from a single file

We can load the data in a single NetCDF file by pointing to its URL, which can be found on the data server. For instance:

di <- dataInventory("http://www.esrl.noaa.gov/psd/thredds/dodsC/Datasets/ncep.reanalysis/surface/air.sig995.1948.nc")
air <- loadGridData("http://www.esrl.noaa.gov/psd/thredds/dodsC/Datasets/ncep.reanalysis/surface/air.sig995.1948.nc", 
                    var = "air")

## [2016-02-18 10:49:35] Doing inventory ...
## [2016-02-18 10:49:38] Retrieving info for 'air' (0 vars remaining)
## [2016-02-18 10:49:38] Done.
## [2016-02-18 10:49:38] Opening connection with remote server...
## [2016-02-18 10:49:40] Connected successfuly
## [2016-02-18 10:49:40] Defining geo-location parameters
## [2016-02-18 10:49:40] Defining time selection parameters
## [2016-02-18 10:49:40] Retrieving data subset ...
## [2016-02-18 10:49:59] Done
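
Remote requests can fail when the server is unreachable or the URL has changed. As a minimal sketch, assuming we prefer a warning and a NULL result over an error that stops the script, the loading call can be wrapped in tryCatch. The helper try_remote is hypothetical and not part of loadeR:

```r
# Hypothetical helper (not part of loadeR): run a loading call and
# return NULL with a warning if the call fails.
try_remote <- function(load_call) {
  tryCatch(
    load_call(),
    error = function(e) {
      warning("Remote load failed: ", conditionMessage(e), call. = FALSE)
      NULL
    }
  )
}

# Usage with the example above (requires loadeR and network access):
# url <- "http://www.esrl.noaa.gov/psd/thredds/dodsC/Datasets/ncep.reanalysis/surface/air.sig995.1948.nc"
# air <- try_remote(function() loadGridData(url, var = "air"))
# if (is.null(air)) message("Check the URL or try again later")
```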

Loading from a dataset (.ncml)

In the previous section of the wiki (sec. 2.1. Dataset definition and loading local grid data) we provided a subset of the NCEP/NCAR Reanalysis 1 data together with the corresponding NcML file, which points to the local directory where the gridded data are stored as NetCDF files. However, datasets served through an OPeNDAP service can also be accessed and loaded remotely through an NcML file, which avoids storing large amounts of unnecessary data locally. When a remote NcML file is available, its remote path is passed for loading. Otherwise, the remote data can be accessed through an NcML file stored locally. The latter is the case for the NCEP remote dataset, so we provide the required NcML file for download:

download.file("http://meteo.unican.es/work/loadeR/data/OPeNDAP_NCEP/ncepReanalysis1_4xDaily.ncml", 
              destfile = "mydirectory/ncep.ncml")

ncep.remote <- "mydirectory/ncep.ncml"
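
Note that download.file() does not create missing folders, so the destination directory must exist before the download. A minimal sketch of that preparation step follows; the helper prepare_destfile is hypothetical, and a temporary folder is used here only so the example can run anywhere:

```r
# Hypothetical helper (not part of loadeR): create the parent folder
# of the destination file if needed, then return the path unchanged.
prepare_destfile <- function(path) {
  dir.create(dirname(path), recursive = TRUE, showWarnings = FALSE)
  path
}

dest <- prepare_destfile(file.path(tempdir(), "mydirectory", "ncep.ncml"))

# The download itself then proceeds as above (requires network access):
# download.file("http://meteo.unican.es/work/loadeR/data/OPeNDAP_NCEP/ncepReanalysis1_4xDaily.ncml",
#               destfile = dest)
# file.exists(dest)
```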

This NcML file points to the NOAA THREDDS data server:

# Print the contents of the NcML file (works on any operating system)
writeLines(readLines(ncep.remote))

Many variables are available for loading:

di <- dataInventory(ncep.remote)

## [2016-02-18 10:49:59] Doing inventory ...
## [2016-02-18 10:50:22] Retrieving info for 'tas' (13 vars remaining)
## [2016-02-18 10:50:22] Retrieving info for 'nlwrs' (12 vars remaining)
## [2016-02-18 10:50:22] Retrieving info for 'nswrs' (11 vars remaining)
## [2016-02-18 10:50:22] Retrieving info for 'prate' (10 vars remaining)
## [2016-02-18 10:50:22] Retrieving info for 'tmax' (9 vars remaining)
## [2016-02-18 10:50:22] Retrieving info for 'tmin' (8 vars remaining)
## [2016-02-18 10:50:22] Retrieving info for 'uas' (7 vars remaining)
## [2016-02-18 10:50:22] Retrieving info for 'vas' (6 vars remaining)
## [2016-02-18 10:50:22] Retrieving info for 'air' (5 vars remaining)
## [2016-02-18 10:50:53] Retrieving info for 'hgt' (4 vars remaining)
## [2016-02-18 10:51:24] Retrieving info for 'shum' (3 vars remaining)
## [2016-02-18 10:51:39] Retrieving info for 'uwnd' (2 vars remaining)
## [2016-02-18 10:52:10] Retrieving info for 'vwnd' (1 vars remaining)
## [2016-02-18 10:52:41] Retrieving info for 'slp' (0 vars remaining)
## [2016-02-18 10:52:41] Done.

names(di)

##  [1] "tas"   "nlwrs" "nswrs" "prate" "tmax"  "tmin"  "uas"   "vas"  
##  [9] "air"   "hgt"   "shum"  "uwnd"  "vwnd"  "slp"

Here we load "tmax" (maximum temperature) for June 1991:

# Load maximum temperature for June 1991
tx <- loadGridData(dataset = ncep.remote, 
                   var = "tmax",
                   season = 6,
                   years = 1991)

## [2016-02-18 10:53:04] Defining geo-location parameters
## [2016-02-18 10:53:05] Defining time selection parameters
## [2016-02-18 10:53:05] Retrieving data subset ...
## [2016-02-18 10:53:17] Done

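If we display the structure, we can see that the loaded data is 6-hourly: the time dimension has 120 steps (30 days × 4 steps per day) and consecutive dates are 6 hours apart.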

str(tx)

## List of 4
##  $ Variable:List of 2
##   ..$ varName: chr "tmax"
##   ..$ level  : NULL
##   ..- attr(*, "use_dictionary")= logi FALSE
##   ..- attr(*, "description")= chr "4xDaily Maximum Temperature at 2 m"
##   ..- attr(*, "units")= chr "degK"
##   ..- attr(*, "longname")= chr "tmax"
##   ..- attr(*, "daily_agg_cellfun")= chr "none"
##   ..- attr(*, "monthly_agg_cellfun")= chr "none"
##   ..- attr(*, "verification_time")= chr "none"
##  $ Data    : num [1:120, 1:94, 1:193] 209 211 212 213 213 ...
##   ..- attr(*, "dimensions")= chr [1:3] "time" "lat" "lon"
##  $ xyCoords:List of 2
##   ..$ x: num [1:193] -178 -176 -174 -172 -171 ...
##   ..$ y: num [1:94] -88.5 -86.7 -84.8 -82.9 -80.9 ...
##   ..- attr(*, "projection")= chr "LatLonProjection"
##  $ Dates   :List of 2
##   ..$ start: chr [1:120] "1991-06-01 00:00:00 GMT" "1991-06-01 06:00:00 GMT" "1991-06-01 12:00:00 GMT" "1991-06-01 18:00:00 GMT" ...
##   ..$ end  : chr [1:120] "1991-06-01 00:00:00 GMT" "1991-06-01 06:00:00 GMT" "1991-06-01 12:00:00 GMT" "1991-06-01 18:00:00 GMT" ...
##  - attr(*, "dataset")= chr "/oceano/gmeteo/WORK/WWW/loadeR/data/OPeNDAP_NCEP/ncepReanalysis1_4xDaily.ncml"
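
The length of the time dimension can be checked independently of loadeR. The following base R sketch rebuilds the 6-hourly time axis we would expect for June 1991 and counts its steps; the object name steps is used only for illustration:

```r
# Expected 6-hourly time axis for June 1991 (base R only)
steps <- seq(from = as.POSIXct("1991-06-01 00:00:00", tz = "GMT"),
             to   = as.POSIXct("1991-06-30 18:00:00", tz = "GMT"),
             by   = "6 hours")

# 30 days x 4 steps per day = 120, matching the time dimension of tx$Data
length(steps)

# First four time stamps, for comparison with tx$Dates$start
format(steps[1:4], "%Y-%m-%d %H:%M:%S")
```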

We can apply an aggregation function to obtain daily data (time = "DD" with aggr.d, e.g. aggr.d = "mean") or monthly data (aggr.m); a "NOTE" is returned with the aggregation details:

tx <- loadGridData(dataset = ncep.remote,
                   var = "tmax",
                   season = 6,
                   years = 1991,
                   time = "DD",
                   aggr.d = "mean")

## [2016-02-18 10:53:40] Defining geo-location parameters
## [2016-02-18 10:53:40] Defining time selection parameters
## NOTE: Daily aggregation will be computed from 6-hourly data
## [2016-02-18 10:53:40] Retrieving data subset ...
## [2016-02-18 10:53:57] Done
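
To make the daily aggregation concrete, the sketch below applies the same idea to a toy array in base R. It is not the internal implementation of loadeR: the array x6h is random data with the same dimension order as tx$Data (time, lat, lon), and it assumes that each block of four consecutive time steps belongs to one day.

```r
set.seed(1)
n_days <- 30; n_lat <- 2; n_lon <- 3

# Toy 6-hourly array with dimensions time x lat x lon (values in K)
x6h <- array(rnorm(n_days * 4 * n_lat * n_lon, mean = 280, sd = 5),
             dim = c(n_days * 4, n_lat, n_lon))

# Day index: four consecutive 6-hourly steps per day
day <- rep(seq_len(n_days), each = 4)

# Average the four values of each day, separately for every grid cell
daily <- apply(x6h, c(2, 3), function(ts) tapply(ts, day, mean))

dim(daily)  # 30 2 3: one value per day and grid cell
```

After aggregation the time dimension shrinks from 120 to 30 steps, while the spatial dimensions are unchanged.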

Other available remote model-datasets

The User Data Gateway

In addition to the loading options described above, the NCEP/NCAR Reanalysis 1 data (and other datasets) can also be accessed and loaded remotely through the User Data Gateway (UDG), the one-stop shop for climate data access maintained by the Santander MetGroup, which provides an easy and homogeneous way to load data from different datasets. Go to section 2.3. Loading data from the User Data Gateway (UDG) for more information on the available datasets and a worked example.

EOBS

The EOBS database can be accessed remotely via OPeNDAP using loadeR. The available datasets can be accessed here.

Example:

library(loadeR)
# Precipitation dataset URL
ds <- "http://opendap.knmi.nl/knmi/thredds/dodsC/e-obs_0.25regular/rr_0.25deg_reg_v12.0.nc"
# Load daily spring precipitation (MAM) for the period 1991-2000 over the Iberian Peninsula:
precip.MAM <- loadGridData(dataset = ds,
                           var = "rr",
                           lonLim = c(-10, 3),
                           latLim = c(36, 44),
                           years = 1991:2000,
                           season = 3:5)
## [2016-02-24 16:42:35] Opening connection with remote server...
## [2016-02-24 16:42:36] Connected successfuly
## [2016-02-24 16:42:36] Defining geo-location parameters
## [2016-02-24 16:42:36] Defining time selection parameters
## [2016-02-24 16:42:36] Retrieving data subset ...
## [2016-02-24 16:42:43] Done
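
As a quick sanity check on the time selection, the number of daily steps implied by season = 3:5 and years = 1991:2000 can be worked out in base R. This sketch assumes a daily dataset with no missing dates in that period; the object names days and mam are illustrative only:

```r
# All calendar days in 1991-2000
days <- seq(as.Date("1991-01-01"), as.Date("2000-12-31"), by = "day")

# Keep March, April and May (season = 3:5)
mam <- days[as.integer(format(days, "%m")) %in% 3:5]

length(mam)  # 920 = 92 days (31 + 30 + 31) x 10 years
range(mam)   # first and last selected dates
```

Under that assumption, the time dimension of precip.MAM$Data would be expected to have 920 steps.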

The NASA NEX-GDDP Data Server

The NASA NEX-GDDP data server provides open online access to CMIP5 downscaled climate scenarios for the entire globe from the NASA Earth Exchange Global Daily Downscaled Projections (NEX-GDDP). These are directly accessible with loadeR through the catalogs exposed at the project's THREDDS data server.

See the Examples section in the loadGridData help documentation for an example on how to access it.

help("loadGridData", package = "loadeR")
