Skip to content

Application to Seasonal Forecasts (PP)

miturbide edited this page Oct 6, 2016 · 1 revision

In this worked example, we will perform analog downscaling of the System4 seasonal forecast of summer (JJA) maximum surface temperature on six weather stations of the GSN network in Spain. NCEP reanalysis data will be used for training the model and then we will perform the test on the System4 model of 15 members.

Loading and preparing data

Here, test data loading is done accessing the ECOMS User Data Gateway via the loadECOMS function from package ecomsUDG.Racess (registration and log in to the UDG Thredds Administration Portal is needed). Further examples are given in the ECOMS-UDG wiki.

####Observation data

In this example, VALUE ECA&D station data will be used as reference observation:

value <- "/home/maialen/WORK/LOCAL/downscaler_wiki/data/datasets/VALUE_ECA_86_v2"
download.file("http://meteo.unican.es/work/downscaler/data/VALUE_ECA_86_v2.tar.gz", 
              destfile = "mydirectory/VALUE_ECA_86_v2.tar.gz")
# Extract files from the tar.gz file
untar("mydirectory/VALUE_ECA_86_v2.tar.gz", exdir = "mydirectory")
# Data inventory
value <- "mydirectory/VALUE_ECA_86_v2"

di <-(dataInventory(value))

Next, we load station data within a given geographical bounding box, thus, the lonLim and latLim arguments are filled with a vector of length two, defining the corners of the bounding box. For instance:

obs <- loadStationData(dataset = value, 
                                var = c( "tmean"), 
                                lonLim = c(-11,5), 
                                latLim = c(35,46),
                                season= 6:8, 
                                years = 1991:2000)

str(obs)

We can plot the annual time series using function getYearsAsINDEX. Here we plot the time series for each loaded station (legend = station ID):

# Plotting of annual time series
yrs <- getYearsAsINDEX(obs)
plot.data <- sapply(1:ncol(obs$Data), function(x) {tapply(obs$Data[ ,x], INDEX = yrs, FUN = mean, na.rm = TRUE)})
colnames(plot.data)  <- obs$Metadata$station_id
plot(unique(yrs), plot.data[ ,1], ylim = c(floor(min(plot.data)) - 5, ceiling(max(plot.data))), ty = "b", ylab = "tasmax", xlab = "year")
for (i in 2:ncol(plot.data)) {
      lines(unique(yrs), plot.data[ ,i], col = i, ty = 'b')
}
legend("bottom", colnames(plot.data), col = 1:6, pch = 21, ncol = 2)
title("Mean maximum temperature summer (JJA)")

####Predictors

In this example, the NCEP reanalysis data will be used, that can be loaded accesing through the ECOMS User Data Gateway (ECOMS-UDG) or also accessing the netcdf files remotely through the corresponding ncml file (see section 2.2. Accessing and loading grid data). However, in this case, we are going to use a subdomain of the NCEP dataset encompassing the period 1961-2010 for the Iberian Peninsula, which is available in a tar.gz file for download:

download.file("http://meteo.unican.es/work/downscaler/data/Iberia_NCEP.tar.gz", 
              destfile = "data/datasets/Iberia_NCEP.tar.gz")
# Extract files from the tar.gz file
untar("mydirectory/Iberia_NCEP.tar.gz", exdir = "mydirectory")
# Define the path to the ncml file
ncep <- "mydirectory/Iberia_NCEP/Iberia_NCEP.ncml"
dir <- "/home/maialen/WORK/LOCAL/downscaler_wiki/"
ncep <- (paste(dir, "data/datasets/Iberia_NCEP/Iberia_NCEP.ncml", sep=""))

The set of predictors (a.k.a. atmospheric pattern) P5 is next loaded for the training period. A single variable for a specific spatiotemporal domain is what we term a field in downscaleR terminology. In this case, for the same spatio-temporal domain we are loading two different variables, that can be merged together forming a multifield. A multifield can be loaded from a given dataset using the loadMultiField function, that takes care of regridding to ensure the spatial matching of the different input fields. (NOTE: it is also possible to load the different fields sepparately and then join them using the multifield constructor makeMultiField).

In this example, we are interested in the variables of the atmospheric pattern P5 (mean surface temperature and sea-level pressure) whose standard names are "tas" and "psl" respectively, as indicated in the vocabulary (see help("vocabulary") for details).

pred <- loadMultiField(ncep, vars = c("tas","psl"), 
                       lonLim = c(-11,5), 
                       latLim = c(35,46), 
                       season = 6:8, 
                       years = 1991:2000)
# This is how the multifield looks like:
plotMeanField(pred)

Next, a PCA is performed on the calibration data retaining just the first 10 PCs (see also section 4.3. Principal Components (and EOFs)).

pred.pca <- prinComp(pred, n.eofs = 10)

####Test data (Seasonal forecasting)

Next, seasonal forecasting data from the System4 model of 15 members (dataset = "System4 seasonal 15") is loaded from the ECOMS User Data Gateway, considering the domain of analysis, the target season and the test period 2001-2010.

load("/home//maialen/WORK/LOCAL/downscaler_wiki/data/s4_psl_tas_iberia.rda")
library(ecomsUDG.Raccess)
loginECOMS_UDG(username = "usr", password = "pswrd")

psl.s4 <- loadECOMS(dataset = "System4_seasonal_15", 
                    var = "psl", 
                    lonLim = c(-11,5), 
                    latLim = c(35,46), 
                    season = 6:8, 
                    years = 2001:2010, 
                    time = "DD",
                    aggr.d = "mean",
                    leadMonth = 2)
tas.s4 <- loadECOMS(dataset = "System4_seasonal_15", 
                    var = "tas", 
                    lonLim = c(-11,5), 
                    latLim = c(35,46), 
                    season = 6:8, 
                    years = 2001:2010,  
                    time = "DD",
                    aggr.d = "mean",
                    leadMonth = 2)

# Example: visualization of a multimember
plotMeanField(psl.s4, multi.member = TRUE)

In the case of the predictors a multifield is loaded directly with downscaleR from the local dataset, in this other case, a single field (predictor) of test data is loaded each time from the ECOMS User Data Gateway, thus, we need to create a multifield joining the predictors with function makeMultiField. Prior to multifield creation, a re-gridding is performed with function interpGridData to the same grid of the predictors (function getGrid).

# Regridding to match predictors grid
psl.s4.2 <- interpData(psl.s4, new.Coordinates = getGrid(pred))
tas.s4.2 <- interpData(tas.s4, new.Coordinates =  getGrid(pred))
# Multifield creation
mf.test <- makeMultiField(psl.s4.2, tas.s4.2)

###Analog downscaling

The analog method is applied with the analog function to downscale the System4 seasonal forecast. The arguments follow the order observations > predictors > test data. Further additional arguments control the different options of the analog construction (type help("analogs") for more details):

an <- analogs(obs = obs, pred = pred.pca, sim = mf.test, n.neigh = 15, sel.fun = "random")

####Validation of results

We can visualize the downscaled forecast skill using the tercileValidation function. To this aim, we load the observation data in the test period:

val <- loadStationData(dataset = value, 
                       var = "tmax", 
                       lonLim = c(-11,5), 
                       latLim = c(35,46), 
                       season = 6:8, 
                       years = 2001:2010)
# This is a plot that performs the mean of all stations
tercileValidation(an, obs = val)
# This example enables the visualization of results for a particular station (e.g. Braganca):
tercileValidation(an, obs = val, stationId = "212")
title("Braganca")
#Annual time series plot
##Year index of the test period
years <- getYearsAsINDEX(val)

##Compute statistic
x <- tapply(val$Data[,1], INDEX = years, FUN = mean, na.rm = TRUE)
xs <- tapply(val$Data[,1], INDEX = years, FUN = sd, na.rm = TRUE)
y <- tapply(mf.test$Data[2,1,,3,3], INDEX = years, FUN = mean, na.rm = TRUE)
ys <- tapply(mf.test$Data[2,1,,3,3], INDEX = years, FUN = sd, na.rm = TRUE)
w <- tapply(an$Data[1,,1], INDEX = years, FUN = mean, na.rm = TRUE)
ws <- tapply(an$Data[1,,1], INDEX = years, FUN = sd, na.rm = TRUE)

plot(1:10, x, lwd = 2, ty = "l", xlab = "Years", ylim = c(15, 30),
ylab = "Annual mean of the Daily tp", cex = .6)
polygon(x = c(1:10, 10:1), y =c (xs+x,rev(x-xs)), col ="grey", border = NA)
polygon(x = c(1:10, 10:1), y =c (ys+y,rev(y-ys)), col = rgb(0,0,1,0.2), border = NA)
polygon(x =  c(1:10, 10:1), y =c (ws+w,rev(w-ws)), col = rgb(1,0,0,0.2), border = NA)
lines(1:10, x, lwd = 2, ty = "l")
lines(1:10, y, lwd = 2, ty = "l", col = "blue")
lines(1:10, w, col = "red", lwd = 2)
legend(6,30, legend = c("observed", "seasonal forecast", "downscaled"),
fill = c("black", "blue", "red"), box.lwd = 0, cex = .6)

<-- Home page of the Wiki

Clone this wiki locally