diff --git a/.Rbuildignore b/.Rbuildignore
index c5b1d46..ad10e84 100644
--- a/.Rbuildignore
+++ b/.Rbuildignore
@@ -1,12 +1,10 @@
 ^.*\.Rproj$
 ^\.Rproj\.user$
 .RData
-README.html
 README.md
 pre_vignette
 .travis.yml
 logo.png
 frescalo.exe
 ^docs$
-vignettes\sparta_vignette.Rmd
 misc/
\ No newline at end of file
diff --git a/.gitignore b/.gitignore
index da8b20b..29e250a 100644
--- a/.gitignore
+++ b/.gitignore
@@ -10,4 +10,4 @@
 frescalo.exe
 sparta.Rproj
-misc/
\ No newline at end of file
+misc/
diff --git a/DESCRIPTION b/DESCRIPTION
index 647aca7..1e06b1c 100644
--- a/DESCRIPTION
+++ b/DESCRIPTION
@@ -1,13 +1,15 @@
 Package: sparta
 Type: Package
 Title: Trend Analysis for Unstructured Data
-Version: 0.1.49
-Date: 2019-03-07
+Version: 0.2.00
+Date: 2019-04-12
 Authors@R: c(person("Tom", "August", role = c("aut", "cre"), email = "tomaug@ceh.ac.uk"),
     person("Gary", "Powney", role = c("aut")),
     person("Charlie", "Outhwaite", role = c("aut")),
     person("Colin", "Harrower", role = c("aut")),
     person("Mark", "Hill", role = c("aut")),
+    person("Jack", "Hatfield", role = c("aut")),
+    person("Francesca", "Mancini", role = c("aut")),
     person("Nick", "Isaac", role = c("aut")))
 Maintainer: Tom August
 Description: Methods used to analyse trends in unstructured
diff --git a/README.html b/README.html
deleted file mode 100644
index 268709c..0000000
--- a/README.html
+++ /dev/null
@@ -1,182 +0,0 @@

-sparta
-
-This R package includes methods used to analyse trends in unstructured occurrence datasets and a range of useful functions for mapping such data in the UK. The package is currently under development.
-
-Tutorials
-
-We have developed tutorials for a few of the functions in the package.
-
-Installation
-
-To install the development version of sparta, it's easiest to use the devtools package:
-
-> # install.packages("devtools")
-> # NOTE: If you have not installed devtools before you will need to restart your R
-> # session before installing to avoid problems
->
-> library(devtools)
-> install_github("sparta", username = 'BiologicalRecordsCentre')
->
-> library(sparta)
-
-PLEASE NOTE THAT SINCE THIS PACKAGE IS IN DEVELOPMENT THE STRUCTURE AND FUNCTIONALITY OF THE PACKAGE ARE LIKELY TO CHANGE OVER TIME. WE WILL TRY TO KEEP THIS FRONT PAGE AND TUTORIALS UP TO DATE SO THAT IT WORKS WITH THE CURRENT MASTER VERSION ON GITHUB.
diff --git a/README.md b/README.md
index 57decfb..8b2f114 100644
--- a/README.md
+++ b/README.md
@@ -6,12 +6,12 @@
 This R package includes methods used to analyse trends in unstructured occurrence datasets and a range of useful functions for mapping such data in the UK. The package is currently **under development**. Note that frescalo currently uses an .exe compiled only for windows.
-### News
+News
 ----------------
 We are in the process of re-writing much of sparta to add in things we learnt from our recent publication (Statistics for citizen science: extracting signals of change from noisy ecological data. 2014. Nick J. B. Isaac, Arco J. van Strien, Tom A. August, Marnix P. de Zeeuw and David B. Roy). Once the re-write is complete the package will go on CRAN.
-### Installation
+Installation
 ----------------
 To **install** the development version of sparta, it's easiest to use the `devtools` package:
@@ -40,7 +40,7 @@
 library(sparta)
 If you have difficulties installing sparta using this method try updating your version of R to the most up-to-date version available. If you still have problems please contact us or use the issues page.
-### Vignette/Tutorial
+Vignette/Tutorial
 ----------------
 We have written a vignette to support the package which you can view [here](https://github.com/BiologicalRecordsCentre/sparta/raw/master/vignettes/sparta_vignette.pdf)
diff --git a/docs/LICENSE-text.html b/docs/LICENSE-text.html
new file mode 100644
index 0000000..6502c20
--- /dev/null
+++ b/docs/LICENSE-text.html
@@ -0,0 +1,155 @@
+License • sparta
+MIT License
+
+Copyright (c) 2019 Centre for Ecology & Hydrology
+
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.
+
+
+Site built with pkgdown 1.3.0.

diff --git a/docs/LICENSE.html b/docs/LICENSE.html
deleted file mode 100644
index c6f9af3..0000000
--- a/docs/LICENSE.html
+++ /dev/null
@@ -1,115 +0,0 @@
-License • sparta
-YEAR: 2015
-COPYRIGHT HOLDER: Centre for Ecology and Hydrology (CEH)
-
-Site built with pkgdown.
diff --git a/docs/_config.yml b/docs/_config.yml
deleted file mode 100644
index 2f7efbe..0000000
--- a/docs/_config.yml
+++ /dev/null
@@ -1 +0,0 @@
-theme: jekyll-theme-minimal
\ No newline at end of file
diff --git a/docs/articles/index.html b/docs/articles/index.html
index d1da8f4..5463fce 100644
--- a/docs/articles/index.html
+++ b/docs/articles/index.html
@@ -1,6 +1,6 @@
@@ -9,24 +9,37 @@
 Articles • sparta
@@ -84,18 +103,18 @@
@@ -103,15 +122,17 @@

All vignettes

-Site built with pkgdown.
+Site built with pkgdown 1.3.0.
diff --git a/docs/articles/sparta_vignette.html b/docs/articles/sparta_vignette.html
index b9b2de2..26b9146 100644
--- a/docs/articles/sparta_vignette.html
+++ b/docs/articles/sparta_vignette.html
@@ -1,31 +1,40 @@
-sparta - Species Presence/Absence R Trends Analyses • sparta

Introduction

@@ -88,16 +94,17 @@

NOTE: JAGS must be installed before the R package installation will work. JAGS can be found here - http://sourceforge.net/projects/mcmc-jags/files/JAGS/
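Since the occupancy models later in this vignette are fitted via JAGS, it can be worth confirming that R can actually see a JAGS installation before installing sparta. A minimal check, assuming only that the rjags package is installed (it fails to load, with an informative error, when the JAGS library itself is missing):

# This will error if the JAGS library cannot be found on your system
library(rjags)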

# Install the package from CRAN
 # THIS WILL WORK ONLY AFTER THE PACKAGE IS PUBLISHED
-install.packages('sparta')
+install.packages('sparta')
 
 # Or install the development version from GitHub
-library(devtools)
+library(devtools)
 install_github('biologicalrecordscentre/sparta')
# Once installed, load the package
-library(sparta)
-
-## Loading required package: lme4
-
-## Loading required package: Matrix
-

-The functions in sparta cover a range of tasks. Primarily they are focused on analysing trends in species occurrence data while accounting for biases (see Isaac et al, 2014). In this vignette we step through these functions and others so that you can understand how the package works. If you have any questions you can find the package maintainer's email address using maintainer('sparta'), and if you have issues or bugs you can report them here

+library(sparta)

+
+## Loading required package: lme4
+## Loading required package: Matrix
+## Loading required package: Rcpp
+

+The functions in sparta cover a range of tasks. Primarily they are focused on analysing trends in species occurrence data while accounting for biases (see Isaac et al, 2014). In this vignette we step through these functions and others so that you can understand how the package works. If you have any questions you can find the package maintainer's email address using maintainer('sparta'), and if you have issues or bugs you can report them here

@@ -113,30 +120,30 @@

 nyr <- 50 # number of years in data
 nSamples <- 200 # set number of dates
 nSites <- 100 # set number of sites
-set.seed(125) # set a random seed
+set.seed(125) # set a random seed
 
 # Create some dates
-first <- as.Date(strptime("1950/01/01", "%Y/%m/%d"))
-last <- as.Date(strptime(paste(1950+(nyr-1),"/12/31", sep=''), "%Y/%m/%d"))
+first <- as.Date(strptime("1950/01/01", "%Y/%m/%d"))
+last <- as.Date(strptime(paste(1950+(nyr-1),"/12/31", sep=''), "%Y/%m/%d"))
 dt <- last-first
-rDates <- first + (runif(nSamples)*dt)
+rDates <- first + (runif(nSamples)*dt)
 
 # taxa are set semi-randomly
-taxa_probabilities <- seq(from = 0.1, to = 0.7, length.out = 26)
-taxa <- sample(letters, size = n, TRUE, prob = taxa_probabilities)
+taxa_probabilities <- seq(from = 0.1, to = 0.7, length.out = 26)
+taxa <- sample(letters, size = n, TRUE, prob = taxa_probabilities)
 
 # sites are visited semi-randomly
-site_probabilities <- seq(from = 0.1, to = 0.7, length.out = nSites)
-site <- sample(paste('A', 1:nSites, sep=''), size = n, TRUE, prob = site_probabilities)
+site_probabilities <- seq(from = 0.1, to = 0.7, length.out = nSites)
+site <- sample(paste('A', 1:nSites, sep=''), size = n, TRUE, prob = site_probabilities)
 
 # the date of visit is selected semi-randomly from those created earlier
-time_probabilities <- seq(from = 0.1, to = 0.7, length.out = nSamples)
-time_period <- sample(rDates, size = n, TRUE, prob = time_probabilities)
+time_probabilities <- seq(from = 0.1, to = 0.7, length.out = nSamples)
+time_period <- sample(rDates, size = n, TRUE, prob = time_probabilities)
 
-myData <- data.frame(taxa, site, time_period)
+myData <- data.frame(taxa, site, time_period)
 
 # Let's have a look at my example data
-head(myData)

+head(myData)
##   taxa site time_period
 ## 1    r  A51  1970-01-14
 ## 2    v  A87  1980-09-29
@@ -158,7 +165,9 @@ 

progress_bar = FALSE)

## Warning in errorChecks(taxa = taxa, site = site, time_period =
 ## time_period): 94 out of 8000 observations will be removed as duplicates
## ## Linear model outputs ##
 ## 
 ## There is no detectable change in the number of records over time:
@@ -178,11 +187,13 @@ 

 # is set to be a year
 results <- dataDiagnostics(taxa = myData$taxa,
                            site = myData$site,
-                           time_period = as.numeric(format(myData$time_period, '%Y')),
+                           time_period = as.numeric(format(myData$time_period, '%Y')),
                            progress_bar = FALSE)

## Warning in errorChecks(taxa = taxa, site = site, time_period =
 ## time_period): 419 out of 8000 observations will be removed as duplicates
## ## Linear model outputs ##
 ## 
 ## There is no detectable change in the number of records over time:
@@ -199,14 +210,14 @@ 

## time_period 0.0007201245 0.0007874907 0.9144546 0.3604780

If we want to view these results in more detail we can interrogate the object results.

# See what is in results..
-names(results)
+names(results)
## [1] "RecordsPerYear"  "VisitListLength" "modelRecs"       "modelList"
# Let's have a look at the details
-head(results$RecordsPerYear)
+head(results$RecordsPerYear)
## RecordsPerYear
 ## 1950 1951 1952 1953 1954 1955 
 ##  224   69  147  181  119  218
-head(results$VisitListLength)
+head(results$VisitListLength)
##   time_period site listLength
 ## 1        1950 A100          3
 ## 2        1950  A11          1
@@ -214,7 +225,7 @@ 

## 4        1950  A13          1
## 5        1950  A15          1
## 6        1950  A16          2

-summary(results$modelRecs)
+summary(results$modelRecs)
## 
 ## Call:
 ## glm(formula = count ~ time_period, data = mData)
@@ -235,7 +246,7 @@ 

## AIC: 594.01
## 
## Number of Fisher Scoring iterations: 2

-summary(results$modelList)
+summary(results$modelList)
## 
 ## Call:
 ## glm(formula = listLength ~ time_period, family = "poisson", data = space_time)
@@ -264,8 +275,8 @@ 

Our data is not quite in the correct format for Telfer since it is used to compare time periods but our time_period column is a date. We can fix this by using the date2timeperiod function.

## Create a new column for the time period
 # First define my time periods
-time_periods <- data.frame(start = c(1950, 1960, 1970, 1980, 1990),
-                           end = c(1959, 1969, 1979, 1989, 1999))
+time_periods <- data.frame(start = c(1950, 1960, 1970, 1980, 1990),
+                           end = c(1959, 1969, 1979, 1989, 1999))
 
 time_periods
##   start  end
@@ -277,7 +288,7 @@ 

# Now use these to assign my dates to time periods
 myData$tp <- date2timeperiod(myData$time_period, time_periods)
 
-head(myData)
+head(myData)

##   taxa site time_period tp
 ## 1    r  A51  1970-01-14  3
 ## 2    v  A87  1980-09-29  4
@@ -287,10 +298,10 @@ 

## 6 x A48 1990-02-25 5

As you can see our new column indicates which time period each date falls into, with 1 being the earliest time period, 2 being the second, and so on. This function will also work if, instead of a single date for each record, you have a date range.

## Create a dataset where we have date ranges
-Date_range <- data.frame(startdate = myData$time_period,
+Date_range <- data.frame(startdate = myData$time_period,
                          enddate = (myData$time_period + 600))
 
-head(Date_range)
+head(Date_range)

##    startdate    enddate
 ## 1 1970-01-14 1971-09-06
 ## 2 1980-09-29 1982-05-22
@@ -301,7 +312,7 @@ 

# Now assign my date ranges to time periods
 Date_range$time_period <- date2timeperiod(Date_range, time_periods)
 
-head(Date_range)
+head(Date_range)

##    startdate    enddate time_period
 ## 1 1970-01-14 1971-09-06           3
 ## 2 1980-09-29 1982-05-22           4
@@ -312,7 +323,7 @@ 

As you can see in this example when a date range spans the boundaries of your time periods NA is returned.
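As a quick sanity check, the number of records lost to boundary-spanning ranges can be counted directly; a small aside using the Date_range object created above:

# Count the date ranges that span a time period boundary (assigned NA)
sum(is.na(Date_range$time_period))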

Now that we have our data in the right format we can use the telfer function to analyse the data. The Telfer index for each species is the standardized residual from a linear regression across all species and is a measure of relative change only, as the average real trend across species is obscured (Isaac et al (2014); Telfer et al, 2002). Telfer is used for comparing two time periods, and if you have more than this the telfer function will carry out all pair-wise comparisons.

# Here is our data
-head(myData)
+head(myData)

##   taxa site time_period tp
 ## 1    r  A51  1970-01-14  3
 ## 2    v  A87  1980-09-29  4
@@ -326,66 +337,23 @@ 

minSite = 2)

## Warning in errorChecks(taxa = taxa, site = site, time_period =
 ## time_period, : 2541 out of 8000 observations will be removed as duplicates
-
-## Warning in merge.data.frame(a, b, all = TRUE, by = "taxa"): column names
-## 'Nsite_1.x', 'Nsite_1.y' are duplicated in the result
-
-## Warning in merge.data.frame(a, b, all = TRUE, by = "taxa"): column names
-## 'Nsite_1.x', 'Nsite_1.y' are duplicated in the result
-
-## Warning in merge.data.frame(a, b, all = TRUE, by = "taxa"): column names
-## 'Nsite_1.x', 'Nsite_1.y' are duplicated in the result
-
-## Warning in merge.data.frame(a, b, all = TRUE, by = "taxa"): column names
-## 'Nsite_1.x', 'Nsite_1.y', 'Nsite_2.x', 'Nsite_2.y' are duplicated in the
-## result
-
-## Warning in merge.data.frame(a, b, all = TRUE, by = "taxa"): column names
-## 'Nsite_1.x', 'Nsite_1.y', 'Nsite_2.x', 'Nsite_2.y' are duplicated in the
-## result
-
-## Warning in merge.data.frame(a, b, all = TRUE, by = "taxa"): column
-## names 'Nsite_1.x', 'Nsite_1.y', 'Nsite_2.x', 'Nsite_2.y', 'Nsite_3.x',
-## 'Nsite_3.y' are duplicated in the result
-
-## Warning in merge.data.frame(a, b, all = TRUE, by = "taxa"): column
-## names 'Nsite_1.x', 'Nsite_1.y', 'Nsite_2.x', 'Nsite_2.y', 'Nsite_3.x',
-## 'Nsite_4.x', 'Nsite_3.y', 'Nsite_5.x', 'Nsite_4.y', 'Nsite_5.y' are
-## duplicated in the result

We get a warning message indicating that a large number of rows are being removed as duplicates. This occurs since we are now aggregating records into time periods and therefore creating a large number of duplicates.

The results give the change index for each species (rows) in each of the pairwise comparisons of time periods (columns).

-head(telfer_results)
-
-##   taxa Nsite_1.x Nsite_2.x  Telfer_1_2 Nsite_1.y Nsite_3.x   Telfer_1_3
-## 1    a        13        16 -0.67842545        13        13 -1.744577671
-## 2    b        18        19 -0.90368128        18        21 -0.841219630
-## 3    c        16        22  0.96096754        16        22 -0.008737329
-## 4    d        17        23  0.79744179        17        21 -0.558165922
-## 5    e        23        27 -0.01856808        23        32  0.490523483
-## 6    f        28        28 -0.80201507        28        33 -0.412461197
-##   Nsite_1.x Nsite_4.x Telfer_1_4 Nsite_1.y Nsite_5.x Telfer_1_5 Nsite_2.y
-## 1        13         8 -1.8073843        13        17 -0.7000801        17
-## 2        18        16 -0.8697828        18        18 -1.5449132        20
-## 3        16        19  0.2181534        16        23  0.3726534        22
-## 4        17        21  0.3848417        17        30  1.6642357        23
-## 5        23        20 -1.0901348        23        21 -1.6500473        27
-## 6        28        25 -1.0846426        28        38  0.3817399        29
-##   Nsite_3.y Telfer_2_3 Nsite_2.x Nsite_4.y Telfer_2_4 Nsite_2.y Nsite_5.y
-## 1        13 -1.8352888        17         8 -2.1097232        17        17
-## 2        21 -0.5139840        20        16 -0.6234749        20        18
-## 3        22 -0.7254485        22        19 -0.3891040        22        23
-## 4        21 -1.1759409        23        21 -0.1875890        23        30
-## 5        32  0.3450083        27        20 -1.1254544        27        21
-## 6        33  0.1657078        29        25 -0.5122655        29        38
-##   Telfer_2_5 Nsite_3.x Nsite_4.x Telfer_3_4 Nsite_3.y Nsite_5.x Telfer_3_5
-## 1 -0.4557972        13         8 -1.1728237        13        17  0.8437536
-## 2 -0.8326960        21        16 -0.3171487        21        18 -1.1756988
-## 3 -0.3595835        22        19  0.3549603        22        23 -0.2184517
-## 4  0.5294236        21        21  1.2663488        21        30  1.3562488
-## 5 -1.7153826        32        20 -1.8881411        32        21 -2.1972910
-## 6  0.8827473        33        25 -0.8383498        33        38  0.4662370
-##   Nsite_4.y Nsite_5.y Telfer_4_5
-## 1         8        17  1.4880569
-## 2        16        18 -0.8995878
-## 3        19        23 -0.3834038
-## 4        21        30  0.6466352
-## 5        20        21 -1.0810351
-## 6        25        38  1.3111555
+head(telfer_results)
+
+##   taxa  Telfer_1_2   Telfer_1_3 Telfer_1_4 Telfer_1_5 Telfer_2_3
+## 1    a -0.67842545 -1.744577671 -1.8073843 -0.7000801 -1.8352888
+## 2    b -0.90368128 -0.841219630 -0.8697828 -1.5449132 -0.5139840
+## 3    c  0.96096754 -0.008737329  0.2181534  0.3726534 -0.7254485
+## 4    d  0.79744179 -0.558165922  0.3848417  1.6642357 -1.1759409
+## 5    e -0.01856808  0.490523483 -1.0901348 -1.6500473  0.3450083
+## 6    f -0.80201507 -0.412461197 -1.0846426  0.3817399  0.1657078
+##   Telfer_2_4 Telfer_2_5 Telfer_3_4 Telfer_3_5 Telfer_4_5
+## 1 -2.1097232 -0.4557972 -1.1728237  0.8437536  1.4880569
+## 2 -0.6234749 -0.8326960 -0.3171487 -1.1756988 -0.8995878
+## 3 -0.3891040 -0.3595835  0.3549603 -0.2184517 -0.3834038
+## 4 -0.1875890  0.5294236  1.2663488  1.3562488  0.6466352
+## 5 -1.1254544 -1.7153826 -1.8881411 -2.1972910 -1.0810351
+## 6 -0.5122655  0.8827473 -0.8383498  0.4662370  1.3111555

@@ -405,7 +373,7 @@

minL = 2)

## Warning in errorChecks(taxa, site, time_period): 94 out of 8000
 ## observations will be removed as duplicates
-head(myDataL)
+head(myDataL)
##   taxa site time_period
 ## 1    u   A1  1952-11-16
 ## 2    n   A1  1952-11-16
@@ -414,9 +382,9 @@ 

## 5    x   A1  1999-08-03
## 6    d   A1  1999-08-03

# We now have a much smaller dataset after subsetting
-nrow(myData)
+nrow(myData)
## [1] 8000
-nrow(myDataL)
+nrow(myDataL)
## [1] 3082

We are also able to subset by the number of times a site is sampled. The function siteSelectionMinTP does this. When time_period is a date, as in this case, minTP is the minimum number of years a site must be sampled in for it to be included in the subset.

# Select only data from sites sampled in at least 10 years
@@ -426,7 +394,7 @@ 

minTP = 10)

## Warning in errorChecks(taxa, site, time_period): 94 out of 8000
 ## observations will be removed as duplicates
-head(myDataTP)
+head(myDataTP)
##   taxa site time_period
 ## 1    r  A51  1970-01-14
 ## 2    v  A87  1980-09-29
@@ -437,22 +405,22 @@ 

# Here we have only lost a small number of rows, this is because
 # many sites in our data are visited in a lot of years. Those
 # rows that have been removed are duplicates
-nrow(myData)
+nrow(myData)

## [1] 8000
-nrow(myDataTP)
+nrow(myDataTP)
## [1] 7906

As you can see in the above example minTP specifies the number of years a site must be sampled in order to be included. However, our dataset is very well sampled so we might be interested in another measure of time. For example, you might want only sites that have been observed in at least 60 months. Let’s see how this could be done.

# We need to create a new column to represent unique months
 # this could also be any unit of time you wanted (week, decade, etc.)
 
 # This line returns a unique character for each month
-unique_Months <- format(myData$time_period, "%B_%Y")
-head(unique_Months)
+unique_Months <- format(myData$time_period, "%B_%Y")
+head(unique_Months)
## [1] "January_1970"   "September_1980" "April_1996"     "January_1959"  
 ## [5] "September_1970" "February_1990"
# Week could be done like this, see ?strptime for more details
-unique_Weeks <- format(myData$time_period, "%U_%Y")
-head(unique_Weeks)
+unique_Weeks <- format(myData$time_period, "%U_%Y")
+head(unique_Weeks)
## [1] "02_1970" "39_1980" "15_1996" "02_1959" "38_1970" "08_1990"
# Now lets subset to records found on 60 months or more
 myData60Months <- siteSelectionMinTP(taxa = myData$taxa,
@@ -461,7 +429,7 @@ 

minTP = 60)

## Warning in errorChecks(taxa, site, time_period): 129 out of 8000
 ## observations will be removed as duplicates
-head(myData60Months)
+head(myData60Months)
##   taxa site    time_period
 ## 1    r  A51   January_1970
 ## 2    v  A87 September_1980
@@ -471,10 +439,10 @@ 

## 7 t A59 January_1981

# We could merge this back with our original data if
 # we need to retain the full dates
-myData60Months <- merge(myData60Months, myData$time_period, 
+myData60Months <- merge(myData60Months, myData$time_period, 
                         all.x = TRUE, all.y = FALSE,
                         by = "row.names")
-head(myData60Months)
+head(myData60Months)
##   Row.names taxa site  time_period          y
 ## 1         1    r  A51 January_1970 1970-01-14
 ## 2        10    w  A81    June_1982 1982-06-19
@@ -482,9 +450,9 @@ 

## 4      1000    h  A94     May_1990 1981-01-17
## 5      1001    m  A73   March_1999 1990-05-18
## 6      1002    b  A59    July_1997 1999-03-05

-nrow(myData)
+nrow(myData)
## [1] 8000
-nrow(myData60Months)
+nrow(myData60Months)
## [1] 5289

Following the method in Roy et al (2012) we can combine these two functions to subset both by the length of lists and by the number of years that sites are sampled. This has been wrapped up into the function siteSelection, which takes all the arguments of the previous two functions plus the argument LFirst, which indicates whether the data should be subset by list length first (TRUE) or second (FALSE).

# Subset our data as above but in one go
@@ -496,7 +464,7 @@ 

LFirst = TRUE)

## Warning in errorChecks(taxa, site, time_period): 94 out of 8000
 ## observations will be removed as duplicates
-head(myDataSubset)
+head(myDataSubset)
##    taxa site time_period
 ## 11    y A100  1950-01-04
 ## 12    k A100  1950-01-04
@@ -504,7 +472,7 @@ 

## 14    o A100  1954-01-30
## 15    s A100  1954-01-30
## 16    m A100  1956-02-02

-nrow(myDataSubset)
+nrow(myDataSubset)
## [1] 2587
@@ -513,13 +481,13 @@

Once you have subset your data using the above functions (or perhaps not at all) the reporting rate models can be applied using the function reportingRateModel. This function offers flexibility in the model you wish to fit, allowing the user to specify whether list length and site should be used as covariates, whether over-dispersion should be used, and whether the family should be binomial or Bernoulli. A number of these variants are presented in Isaac et al (2014). While multi-species data is required it is not necessary to model all species. In fact you can save a significant amount of time by only modelling the species you are interested in.

# Run the reporting rate model using list length as a fixed effect and 
 # site as a random effect. Here we only model a few species.
-system.time({
+system.time({
 RR_out <- reportingRateModel(taxa = myData$taxa,
                              site = myData$site,
                              time_period = myData$time_period,
                              list_length = TRUE,
                              site_effect = TRUE,
-                             species_to_include = c('e','u','r','o','t','a','s'),
+                             species_to_include = c('e','u','r','o','t','a','s'),
                              overdispersion = FALSE,
                              family = 'Bernoulli',
                              print_progress = TRUE)
@@ -534,9 +502,9 @@ 

## Modelling a - Species 6 of 7
## Modelling s - Species 7 of 7

##    user  system elapsed 
-##   25.91    0.00   27.97
+##   11.44    0.00   11.46
# Let's have a look at the data that is returned
-str(RR_out)
+str(RR_out)
## 'data.frame':    7 obs. of  14 variables:
 ##  $ species_name       : Factor w/ 7 levels "e","u","r","o",..: 1 2 3 4 5 6 7
 ##  $ intercept.estimate : num  -4.53 -3.52 -3.32 -3.63 -3.68 ...
@@ -548,9 +516,9 @@ 

##  $ intercept.zvalue   : num  -24.5 -31.3 -27.1 -28.2 -31.6 ...
##  $ year.zvalue        : num  -0.9961 -2.0324 -0.9244 0.0688 -1.1177 ...
##  $ listlength.zvalue  : num  5.25 10.65 6.26 7.54 10.49 ...
-##  $ intercept.pvalue   : num  2.30e-132 1.55e-214 6.09e-162 1.07e-174 3.77e-219 ...
+##  $ intercept.pvalue   : num  2.34e-132 1.58e-214 6.06e-162 1.07e-174 3.78e-219 ...
##  $ year.pvalue        : num  0.3192 0.0421 0.3553 0.9452 0.2637 ...
-##  $ listlength.pvalue  : num  1.51e-07 1.67e-26 3.76e-10 4.78e-14 9.57e-26 ...
+##  $ listlength.pvalue  : num  1.51e-07 1.68e-26 3.76e-10 4.78e-14 9.57e-26 ...
##  $ observations       : num  144 450 398 346 398 73 426
##  - attr(*, "intercept_year")= num 1974
##  - attr(*, "min_year")= num -24.5
@@ -558,39 +526,45 @@

## - attr(*, "nVisits")= int 6211 ## - attr(*, "model_formula")= chr "taxa ~ year + listLength + (1|site)"

# We could plot these to see the species trends
-with(RR_out,
+with(RR_out,
      # Plot graph
-     {plot(x = 1:7, y = year.estimate,
-           ylim = range(c(year.estimate - year.stderror,
+     {plot(x = 1:7, y = year.estimate,
+           ylim = range(c(year.estimate - year.stderror,
                           year.estimate + year.stderror)),
            ylab = 'Year effect (+/- Std Dev)',
            xlab = 'Species',
            xaxt = "n")
      # Add x-axis with species names
-     axis(1, at = 1:7, labels = species_name)
+     axis(1, at = 1:7, labels = species_name)
      # Add the error bars
-     arrows(1:7, year.estimate - year.stderror,
+     arrows(1:7, year.estimate - year.stderror,
             1:7, year.estimate + year.stderror,
             length = 0.05, angle = 90, code = 3)}
      )

The returned object is a data frame with one row per species. Each column gives information on an element of the model output including covariate estimates, standard errors and p-values. This object also has some attributes giving the year that was chosen as the intercept, the number of visits in the dataset and the model formula used.
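Those attributes can be read straight off the returned data frame with attr(); a small aside, with attribute names taken from the str() output above:

# The model formula, intercept year and number of visits used in the fit
attr(RR_out, 'model_formula')
attr(RR_out, 'intercept_year')
attr(RR_out, 'nVisits')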

These models can take a long time to run when your data set is large or you have a large number of species to model. To make this faster it is possible to parallelise this process across species, which can significantly improve your run times. Here is an example of how we would parallelise the above example using the R package snowfall.

# Load in snowfall
-library(snowfall)
+library(snowfall)

## Loading required package: snow
# I have 4 cpus on my PC so I set cpus to 4
 # when I initialise the cluster
-sfInit(parallel = TRUE, cpus = 4)
-
## R Version:  R version 3.4.1 (2017-06-30)
-
## snowfall 1.84-6.1 initialized (using snow 0.4-2): parallel execution on 4 CPUs.
+sfInit(parallel = TRUE, cpus = 4) +
## Warning in searchCommandline(parallel, cpus = cpus, type = type,
+## socketHosts = socketHosts, : Unknown option on commandline:
+## rmarkdown::render('W:/PYWELL_SHARED/Pywell Projects/BRC/Tom August/R
+## Packages/Trend analyses/sparta/pre_vignette/sparta_vignette.Rmd', encoding
+
## R Version:  R version 3.2.0 (2015-04-16)
+
## snowfall 1.84-6 initialized (using snow 0.3-13): parallel execution on 4 CPUs.
# Export my data to the cluster
-sfExport('myData')
+sfExport('myData')
 
 # I create a function that takes a species name and runs my models
 RR_mod_function <- function(taxa_name){
   
-  library(sparta)
+  library(sparta)
   
   RR_out <- reportingRateModel(species_to_include = taxa_name,
                                taxa = myData$taxa,
@@ -604,22 +578,22 @@ 

 }
 
 # I then run this in parallel
-system.time({
-para_out <- sfClusterApplyLB(c('e','u','r','o','t','a','s'), RR_mod_function)
+system.time({
+para_out <- sfClusterApplyLB(c('e','u','r','o','t','a','s'), RR_mod_function)
 })

##    user  system elapsed 
-##    0.00    0.00   15.43
+##    0.00    0.00    7.21
# Name each element of this output by the species
-RR_out_combined <- do.call(rbind, para_out)
+RR_out_combined <- do.call(rbind, para_out)
 
 # Stop the cluster
-sfStop()
+sfStop()
## 
 ## Stopping cluster
# You'll see the output is the same as when we did it serially but the
 # time taken is shorter. Using a cluster computer with many more than 
 # 4 cores can greatly reduce run time.
-str(RR_out_combined)
+str(RR_out_combined)
## 'data.frame':    7 obs. of  14 variables:
 ##  $ species_name       : Factor w/ 7 levels "e","u","r","o",..: 1 2 3 4 5 6 7
 ##  $ intercept.estimate : num  -4.53 -3.52 -3.32 -3.63 -3.68 ...
@@ -631,9 +605,9 @@ 

##  $ intercept.zvalue   : num  -24.5 -31.3 -27.1 -28.2 -31.6 ...
##  $ year.zvalue        : num  -0.9961 -2.0324 -0.9244 0.0688 -1.1177 ...
##  $ listlength.zvalue  : num  5.25 10.65 6.26 7.54 10.49 ...
-##  $ intercept.pvalue   : num  2.30e-132 1.55e-214 6.09e-162 1.07e-174 3.77e-219 ...
+##  $ intercept.pvalue   : num  2.34e-132 1.58e-214 6.06e-162 1.07e-174 3.78e-219 ...
##  $ year.pvalue        : num  0.3192 0.0421 0.3553 0.9452 0.2637 ...
-##  $ listlength.pvalue  : num  1.51e-07 1.67e-26 3.76e-10 4.78e-14 9.57e-26 ...
+##  $ listlength.pvalue  : num  1.51e-07 1.68e-26 3.76e-10 4.78e-14 9.57e-26 ...
##  $ observations       : num  144 450 398 346 398 73 426
##  - attr(*, "intercept_year")= num 1974
##  - attr(*, "min_year")= num -24.5
@@ -652,7 +626,7 @@

## Warning in errorChecks(taxa, site, time_period): 94 out of 8000
 ## observations will be removed as duplicates
# The data is returned in the same format as from reportingRateModel
-str(WSS_out)
+str(WSS_out)

## 'data.frame':    26 obs. of  10 variables:
 ##  $ species_name      : Factor w/ 26 levels "r","v","e","z",..: 1 2 3 4 5 6 7 8 9 10 ...
 ##  $ intercept.estimate: num  -2.29 -1.85 -3.17 -1.81 -1.75 ...
@@ -661,7 +635,7 @@ 

##  $ year.stderror     : num  0.00684 0.00574 0.00973 0.00565 0.00554 ...
##  $ intercept.zvalue  : num  -22.4 -21.5 -16.9 -21.3 -21.1 ...
##  $ year.zvalue       : num  -1.334 0.208 0.163 0.253 -0.446 ...
-##  $ intercept.pvalue  : num  1.70e-111 1.66e-102 6.54e-64 6.87e-101 1.06e-98 ...
+##  $ intercept.pvalue  : num  1.70e-111 1.66e-102 6.55e-64 6.87e-101 1.06e-98 ...
##  $ year.pvalue       : num  0.182 0.835 0.871 0.8 0.656 ...
##  $ observations      : num  106 157 50 163 171 148 125 155 61 104 ...
##  - attr(*, "intercept_year")= num 1974
@@ -673,22 +647,24 @@

## - attr(*, "minTP")= num 10

# We can plot these and see that we get different results to our
 # previous analysis since this time the method includes subsetting
-with(WSS_out[1:10,],
+with(WSS_out[1:10,],
      # Plot graph
-     {plot(x = 1:10, y = year.estimate,
-           ylim = range(c(year.estimate - year.stderror,
+     {plot(x = 1:10, y = year.estimate,
+           ylim = range(c(year.estimate - year.stderror,
                           year.estimate + year.stderror)),
            ylab = 'Year effect (+/- Std Dev)',
            xlab = 'Species',
            xaxt="n")
      # Add x-axis with species names
-     axis(1, at=1:10, labels = species_name[1:10])
+     axis(1, at=1:10, labels = species_name[1:10])
      # Add the error bars
-     arrows(1:10, year.estimate - year.stderror,
+     arrows(1:10, year.estimate - year.stderror,
             1:10, year.estimate + year.stderror,
             length=0.05, angle=90, code=3)}
      )
@@ -697,7 +673,7 @@

Occupancy models were found by Isaac et al (2014) to be one of the best tools for analysing species occurrence data typical of citizen science projects, being both robust and powerful. This method models the occupancy process separately from the detection process, but we will not go into the details of the model here since there is a growing literature about occupancy models, how and when they should be used. Here we focus on how the occupancy model discussed in Isaac et al 2014 is implemented in sparta.

This function works in a very similar fashion to that of the previous functions we have discussed. The data it takes is ‘What, where, when’ as in other functions; however, here we have the option to specify which species we wish to model. This feature has been added as occupancy models are computationally intensive. The parameters of the function give you control over the number of iterations, burnin, thinning, the number of chains and the seed, and for advanced users there is also the possibility to pass in your own BUGS script.

# Here is our data
-str(myData)
+str(myData)

## 'data.frame':    8000 obs. of  4 variables:
 ##  $ taxa       : Factor w/ 26 levels "a","b","c","d",..: 18 22 5 26 18 24 20 24 17 23 ...
 ##  $ site       : Factor w/ 100 levels "A1","A10","A100",..: 48 87 53 22 76 44 56 66 92 81 ...
@@ -706,11 +682,11 @@ 

# Run an occupancy model for three species
 # Here we use very small number of iterations 
 # to avoid a long run time
-system.time({
+system.time({
 occ_out <- occDetModel(taxa = myData$taxa,
                        site = myData$site,
                        time_period = myData$time_period,
-                       species_list = c('a','b','c','d'),
+                       species_list = c('a','b','c','d'),
                        write_results = FALSE,
                        n_iterations = 200,
                        burnin = 15,
@@ -722,503 +698,82 @@ 

## time_period): 94 out of 8000 observations will be removed as duplicates

## 
 ## ###
-## Modeling a - 1 of 4 taxa
-## #### PLEASE REVIEW THE BELOW ####
-## 
-## Your model settings: sparta, contlistlength
-## 
-## Model File:
-## 
-## model{
-##  #######################################
-## # SPARTA model from GitHub 26/05/2016 #
-## 
-## # State model
-## for (i in 1:nsite){ 
-##   for (t in 1:nyear){   
-##     z[i,t] ~ dbern(muZ[i,t]) 
-##     logit(muZ[i,t])<- a[t] + eta[i] 
-##   }}  
-## 
-## # Priors 
-## # State model priors
-## for(t in 1:nyear){
-##   a[t] ~ dunif(-10,10)   
-## }                 
-## 
-## # RANDOM EFFECT for SITE
-## for (i in 1:nsite) {
-##   eta[i] ~ dnorm(0, tau2)       
-## } 
-## 
-## tau2 <- 1/(sigma2 * sigma2) 
-## sigma2 ~ dunif(0, 5)
-## 
-## 
-## # Observation model priors 
-## for (t in 1:nyear) {
-##   alpha.p[t] ~ dnorm(mu.lp, tau.lp)            
-## }
-## 
-## mu.lp ~ dnorm(0, 0.01)                         
-## tau.lp <- 1 / (sd.lp * sd.lp)                 
-## sd.lp ~ dunif(0, 5)   
-## 
-## # Derived parameters state model
-## 
-## # Finite sample occupancy
-## for (t in 1:nyear) {  
-##   psi.fs[t] <- sum(z[1:nsite,t])/nsite
-## }  LL.p ~ dunif(dtype2p_min, dtype2p_max)
-## ### Observation Model
-## for(j in 1:nvisit) {
-##   y[j] ~ dbern(Py[j])
-##   Py[j]<- z[Site[j],Year[j]]*p[j]
-##   logit(p[j]) <-  alpha.p[Year[j]] + LL.p*logL[j]
-## } }
-## 
-## bugs_data:
-## 
-## List of 9
-##  $ y          : num [1:6211] 0 0 0 0 0 0 0 0 0 0 ...
-##  $ Year       : num [1:6211] 3 7 10 11 12 12 13 14 16 19 ...
-##  $ Site       : int [1:6211] 1 1 1 1 1 1 1 1 1 1 ...
-##  $ nyear      : num 50
-##  $ nsite      : int 100
-##  $ nvisit     : int 6211
-##  $ logL       : num [1:6211] 0.693 0 0 0.693 0 ...
-##  $ dtype2p_min: num -10
-##  $ dtype2p_max: num 10
-## 
-## 
-## init.vals:
-## 
-## List of 3
-##  $ :List of 5
-##   ..$ z      : num [1:100, 1:50] 0 0 1 0 0 0 0 0 0 0 ...
-##   .. ..- attr(*, "dimnames")=List of 2
-##   .. .. ..$ : chr [1:100] "1" "2" "3" "4" ...
-##   .. .. ..$ : chr [1:50] "1" "2" "3" "4" ...
-##   ..$ alpha.p: num [1:50] -0.85 -0.85 -0.85 -0.85 -0.85 ...
-##   ..$ a      : num [1:50] 1.15 1.15 1.15 1.15 1.15 ...
-##   ..$ eta    : num [1:100] -0.364 -0.364 -0.364 -0.364 -0.364 ...
-##   ..$ LL.p   : num 1.53
-##  $ :List of 5
-##   ..$ z      : num [1:100, 1:50] 0 0 1 0 0 0 0 0 0 0 ...
-##   .. ..- attr(*, "dimnames")=List of 2
-##   .. .. ..$ : chr [1:100] "1" "2" "3" "4" ...
-##   .. .. ..$ : chr [1:50] "1" "2" "3" "4" ...
-##   ..$ alpha.p: num [1:50] 1.76 1.76 1.76 1.76 1.76 ...
-##   ..$ a      : num [1:50] -1.82 -1.82 -1.82 -1.82 -1.82 ...
-##   ..$ eta    : num [1:100] 0.112 0.112 0.112 0.112 0.112 ...
-##   ..$ LL.p   : num 1.57
-##  $ :List of 5
-##   ..$ z      : num [1:100, 1:50] 0 0 1 0 0 0 0 0 0 0 ...
-##   .. ..- attr(*, "dimnames")=List of 2
-##   .. .. ..$ : chr [1:100] "1" "2" "3" "4" ...
-##   .. .. ..$ : chr [1:50] "1" "2" "3" "4" ...
-##   ..$ alpha.p: num [1:50] 0.206 0.206 0.206 0.206 0.206 ...
-##   ..$ a      : num [1:50] -0.174 -0.174 -0.174 -0.174 -0.174 ...
-##   ..$ eta    : num [1:100] 1.83 1.83 1.83 1.83 1.83 ...
-##   ..$ LL.p   : num -0.187
-## 
-## 
-## parameters:
-## 
-## psi.fs tau2 tau.lp alpha.p a mu.lp LL.p
+## Modeling a - 1 of 4 taxa

## module glm loaded
## Compiling model graph
 ##    Resolving undeclared variables
 ##    Allocating nodes
-## Graph information:
-##    Observed stochastic nodes: 6211
-##    Unobserved stochastic nodes: 5204
-##    Total graph size: 45069
+##    Graph Size: 64272
 ## 
 ## Initializing model
 ## 
 ## 
 ## ###
 ## Modeling b - 2 of 4 taxa
-## #### PLEASE REVIEW THE BELOW ####
-## 
-## Your model settings: sparta, contlistlength
-## 
-## Model File:
-## 
-## model{
-##  #######################################
-## # SPARTA model from GitHub 26/05/2016 #
-## 
-## # State model
-## for (i in 1:nsite){ 
-##   for (t in 1:nyear){   
-##     z[i,t] ~ dbern(muZ[i,t]) 
-##     logit(muZ[i,t])<- a[t] + eta[i] 
-##   }}  
-## 
-## # Priors 
-## # State model priors
-## for(t in 1:nyear){
-##   a[t] ~ dunif(-10,10)   
-## }                 
-## 
-## # RANDOM EFFECT for SITE
-## for (i in 1:nsite) {
-##   eta[i] ~ dnorm(0, tau2)       
-## } 
-## 
-## tau2 <- 1/(sigma2 * sigma2) 
-## sigma2 ~ dunif(0, 5)
-## 
-## 
-## # Observation model priors 
-## for (t in 1:nyear) {
-##   alpha.p[t] ~ dnorm(mu.lp, tau.lp)            
-## }
-## 
-## mu.lp ~ dnorm(0, 0.01)                         
-## tau.lp <- 1 / (sd.lp * sd.lp)                 
-## sd.lp ~ dunif(0, 5)   
-## 
-## # Derived parameters state model
-## 
-## # Finite sample occupancy
-## for (t in 1:nyear) {  
-##   psi.fs[t] <- sum(z[1:nsite,t])/nsite
-## }  LL.p ~ dunif(dtype2p_min, dtype2p_max)
-## ### Observation Model
-## for(j in 1:nvisit) {
-##   y[j] ~ dbern(Py[j])
-##   Py[j]<- z[Site[j],Year[j]]*p[j]
-##   logit(p[j]) <-  alpha.p[Year[j]] + LL.p*logL[j]
-## } }
-## 
-## bugs_data:
-## 
-## List of 9
-##  $ y          : num [1:6211] 0 0 0 0 0 0 0 0 0 0 ...
-##  $ Year       : num [1:6211] 3 7 10 11 12 12 13 14 16 19 ...
-##  $ Site       : int [1:6211] 1 1 1 1 1 1 1 1 1 1 ...
-##  $ nyear      : num 50
-##  $ nsite      : int 100
-##  $ nvisit     : int 6211
-##  $ logL       : num [1:6211] 0.693 0 0 0.693 0 ...
-##  $ dtype2p_min: num -10
-##  $ dtype2p_max: num 10
-## 
-## 
-## init.vals:
-## 
-## List of 3
-##  $ :List of 5
-##   ..$ z      : num [1:100, 1:50] 0 0 0 0 0 0 0 0 0 0 ...
-##   .. ..- attr(*, "dimnames")=List of 2
-##   .. .. ..$ : chr [1:100] "1" "2" "3" "4" ...
-##   .. .. ..$ : chr [1:50] "1" "2" "3" "4" ...
-##   ..$ alpha.p: num [1:50] -0.85 -0.85 -0.85 -0.85 -0.85 ...
-##   ..$ a      : num [1:50] 1.15 1.15 1.15 1.15 1.15 ...
-##   ..$ eta    : num [1:100] -0.364 -0.364 -0.364 -0.364 -0.364 ...
-##   ..$ LL.p   : num 1.53
-##  $ :List of 5
-##   ..$ z      : num [1:100, 1:50] 0 0 0 0 0 0 0 0 0 0 ...
-##   .. ..- attr(*, "dimnames")=List of 2
-##   .. .. ..$ : chr [1:100] "1" "2" "3" "4" ...
-##   .. .. ..$ : chr [1:50] "1" "2" "3" "4" ...
-##   ..$ alpha.p: num [1:50] 1.76 1.76 1.76 1.76 1.76 ...
-##   ..$ a      : num [1:50] -1.82 -1.82 -1.82 -1.82 -1.82 ...
-##   ..$ eta    : num [1:100] 0.112 0.112 0.112 0.112 0.112 ...
-##   ..$ LL.p   : num 1.57
-##  $ :List of 5
-##   ..$ z      : num [1:100, 1:50] 0 0 0 0 0 0 0 0 0 0 ...
-##   .. ..- attr(*, "dimnames")=List of 2
-##   .. .. ..$ : chr [1:100] "1" "2" "3" "4" ...
-##   .. .. ..$ : chr [1:50] "1" "2" "3" "4" ...
-##   ..$ alpha.p: num [1:50] 0.206 0.206 0.206 0.206 0.206 ...
-##   ..$ a      : num [1:50] -0.174 -0.174 -0.174 -0.174 -0.174 ...
-##   ..$ eta    : num [1:100] 1.83 1.83 1.83 1.83 1.83 ...
-##   ..$ LL.p   : num -0.187
-## 
-## 
-## parameters:
-## 
-## psi.fs tau2 tau.lp alpha.p a mu.lp LL.pCompiling model graph
+## Compiling model graph
 ##    Resolving undeclared variables
 ##    Allocating nodes
-## Graph information:
-##    Observed stochastic nodes: 6211
-##    Unobserved stochastic nodes: 5204
-##    Total graph size: 45069
+##    Graph Size: 64306
 ## 
 ## Initializing model
 ## 
 ## 
 ## ###
 ## Modeling c - 3 of 4 taxa
-## #### PLEASE REVIEW THE BELOW ####
-## 
-## Your model settings: sparta, contlistlength
-## 
-## Model File:
-## 
-## model{
-##  #######################################
-## # SPARTA model from GitHub 26/05/2016 #
-## 
-## # State model
-## for (i in 1:nsite){ 
-##   for (t in 1:nyear){   
-##     z[i,t] ~ dbern(muZ[i,t]) 
-##     logit(muZ[i,t])<- a[t] + eta[i] 
-##   }}  
-## 
-## # Priors 
-## # State model priors
-## for(t in 1:nyear){
-##   a[t] ~ dunif(-10,10)   
-## }                 
-## 
-## # RANDOM EFFECT for SITE
-## for (i in 1:nsite) {
-##   eta[i] ~ dnorm(0, tau2)       
-## } 
-## 
-## tau2 <- 1/(sigma2 * sigma2) 
-## sigma2 ~ dunif(0, 5)
-## 
-## 
-## # Observation model priors 
-## for (t in 1:nyear) {
-##   alpha.p[t] ~ dnorm(mu.lp, tau.lp)            
-## }
-## 
-## mu.lp ~ dnorm(0, 0.01)                         
-## tau.lp <- 1 / (sd.lp * sd.lp)                 
-## sd.lp ~ dunif(0, 5)   
-## 
-## # Derived parameters state model
-## 
-## # Finite sample occupancy
-## for (t in 1:nyear) {  
-##   psi.fs[t] <- sum(z[1:nsite,t])/nsite
-## }  LL.p ~ dunif(dtype2p_min, dtype2p_max)
-## ### Observation Model
-## for(j in 1:nvisit) {
-##   y[j] ~ dbern(Py[j])
-##   Py[j]<- z[Site[j],Year[j]]*p[j]
-##   logit(p[j]) <-  alpha.p[Year[j]] + LL.p*logL[j]
-## } }
-## 
-## bugs_data:
-## 
-## List of 9
-##  $ y          : num [1:6211] 0 0 0 0 0 0 0 0 0 0 ...
-##  $ Year       : num [1:6211] 3 7 10 11 12 12 13 14 16 19 ...
-##  $ Site       : int [1:6211] 1 1 1 1 1 1 1 1 1 1 ...
-##  $ nyear      : num 50
-##  $ nsite      : int 100
-##  $ nvisit     : int 6211
-##  $ logL       : num [1:6211] 0.693 0 0 0.693 0 ...
-##  $ dtype2p_min: num -10
-##  $ dtype2p_max: num 10
-## 
-## 
-## init.vals:
-## 
-## List of 3
-##  $ :List of 5
-##   ..$ z      : num [1:100, 1:50] 0 0 0 0 1 0 0 0 0 0 ...
-##   .. ..- attr(*, "dimnames")=List of 2
-##   .. .. ..$ : chr [1:100] "1" "2" "3" "4" ...
-##   .. .. ..$ : chr [1:50] "1" "2" "3" "4" ...
-##   ..$ alpha.p: num [1:50] -0.85 -0.85 -0.85 -0.85 -0.85 ...
-##   ..$ a      : num [1:50] 1.15 1.15 1.15 1.15 1.15 ...
-##   ..$ eta    : num [1:100] -0.364 -0.364 -0.364 -0.364 -0.364 ...
-##   ..$ LL.p   : num 1.53
-##  $ :List of 5
-##   ..$ z      : num [1:100, 1:50] 0 0 0 0 1 0 0 0 0 0 ...
-##   .. ..- attr(*, "dimnames")=List of 2
-##   .. .. ..$ : chr [1:100] "1" "2" "3" "4" ...
-##   .. .. ..$ : chr [1:50] "1" "2" "3" "4" ...
-##   ..$ alpha.p: num [1:50] 1.76 1.76 1.76 1.76 1.76 ...
-##   ..$ a      : num [1:50] -1.82 -1.82 -1.82 -1.82 -1.82 ...
-##   ..$ eta    : num [1:100] 0.112 0.112 0.112 0.112 0.112 ...
-##   ..$ LL.p   : num 1.57
-##  $ :List of 5
-##   ..$ z      : num [1:100, 1:50] 0 0 0 0 1 0 0 0 0 0 ...
-##   .. ..- attr(*, "dimnames")=List of 2
-##   .. .. ..$ : chr [1:100] "1" "2" "3" "4" ...
-##   .. .. ..$ : chr [1:50] "1" "2" "3" "4" ...
-##   ..$ alpha.p: num [1:50] 0.206 0.206 0.206 0.206 0.206 ...
-##   ..$ a      : num [1:50] -0.174 -0.174 -0.174 -0.174 -0.174 ...
-##   ..$ eta    : num [1:100] 1.83 1.83 1.83 1.83 1.83 ...
-##   ..$ LL.p   : num -0.187
-## 
-## 
-## parameters:
-## 
-## psi.fs tau2 tau.lp alpha.p a mu.lp LL.pCompiling model graph
+## Compiling model graph
 ##    Resolving undeclared variables
 ##    Allocating nodes
-## Graph information:
-##    Observed stochastic nodes: 6211
-##    Unobserved stochastic nodes: 5204
-##    Total graph size: 45069
+##    Graph Size: 64308
 ## 
 ## Initializing model
 ## 
 ## 
 ## ###
 ## Modeling d - 4 of 4 taxa
-## #### PLEASE REVIEW THE BELOW ####
-## 
-## Your model settings: sparta, contlistlength
-## 
-## Model File:
-## 
-## model{
-##  #######################################
-## # SPARTA model from GitHub 26/05/2016 #
-## 
-## # State model
-## for (i in 1:nsite){ 
-##   for (t in 1:nyear){   
-##     z[i,t] ~ dbern(muZ[i,t]) 
-##     logit(muZ[i,t])<- a[t] + eta[i] 
-##   }}  
-## 
-## # Priors 
-## # State model priors
-## for(t in 1:nyear){
-##   a[t] ~ dunif(-10,10)   
-## }                 
-## 
-## # RANDOM EFFECT for SITE
-## for (i in 1:nsite) {
-##   eta[i] ~ dnorm(0, tau2)       
-## } 
-## 
-## tau2 <- 1/(sigma2 * sigma2) 
-## sigma2 ~ dunif(0, 5)
-## 
-## 
-## # Observation model priors 
-## for (t in 1:nyear) {
-##   alpha.p[t] ~ dnorm(mu.lp, tau.lp)            
-## }
-## 
-## mu.lp ~ dnorm(0, 0.01)                         
-## tau.lp <- 1 / (sd.lp * sd.lp)                 
-## sd.lp ~ dunif(0, 5)   
-## 
-## # Derived parameters state model
-## 
-## # Finite sample occupancy
-## for (t in 1:nyear) {  
-##   psi.fs[t] <- sum(z[1:nsite,t])/nsite
-## }  LL.p ~ dunif(dtype2p_min, dtype2p_max)
-## ### Observation Model
-## for(j in 1:nvisit) {
-##   y[j] ~ dbern(Py[j])
-##   Py[j]<- z[Site[j],Year[j]]*p[j]
-##   logit(p[j]) <-  alpha.p[Year[j]] + LL.p*logL[j]
-## } }
-## 
-## bugs_data:
-## 
-## List of 9
-##  $ y          : num [1:6211] 0 0 0 0 0 0 0 0 0 0 ...
-##  $ Year       : num [1:6211] 3 7 10 11 12 12 13 14 16 19 ...
-##  $ Site       : int [1:6211] 1 1 1 1 1 1 1 1 1 1 ...
-##  $ nyear      : num 50
-##  $ nsite      : int 100
-##  $ nvisit     : int 6211
-##  $ logL       : num [1:6211] 0.693 0 0 0.693 0 ...
-##  $ dtype2p_min: num -10
-##  $ dtype2p_max: num 10
-## 
-## 
-## init.vals:
-## 
-## List of 3
-##  $ :List of 5
-##   ..$ z      : num [1:100, 1:50] 0 0 0 0 0 0 0 0 0 0 ...
-##   .. ..- attr(*, "dimnames")=List of 2
-##   .. .. ..$ : chr [1:100] "1" "2" "3" "4" ...
-##   .. .. ..$ : chr [1:50] "1" "2" "3" "4" ...
-##   ..$ alpha.p: num [1:50] -0.85 -0.85 -0.85 -0.85 -0.85 ...
-##   ..$ a      : num [1:50] 1.15 1.15 1.15 1.15 1.15 ...
-##   ..$ eta    : num [1:100] -0.364 -0.364 -0.364 -0.364 -0.364 ...
-##   ..$ LL.p   : num 1.53
-##  $ :List of 5
-##   ..$ z      : num [1:100, 1:50] 0 0 0 0 0 0 0 0 0 0 ...
-##   .. ..- attr(*, "dimnames")=List of 2
-##   .. .. ..$ : chr [1:100] "1" "2" "3" "4" ...
-##   .. .. ..$ : chr [1:50] "1" "2" "3" "4" ...
-##   ..$ alpha.p: num [1:50] 1.76 1.76 1.76 1.76 1.76 ...
-##   ..$ a      : num [1:50] -1.82 -1.82 -1.82 -1.82 -1.82 ...
-##   ..$ eta    : num [1:100] 0.112 0.112 0.112 0.112 0.112 ...
-##   ..$ LL.p   : num 1.57
-##  $ :List of 5
-##   ..$ z      : num [1:100, 1:50] 0 0 0 0 0 0 0 0 0 0 ...
-##   .. ..- attr(*, "dimnames")=List of 2
-##   .. .. ..$ : chr [1:100] "1" "2" "3" "4" ...
-##   .. .. ..$ : chr [1:50] "1" "2" "3" "4" ...
-##   ..$ alpha.p: num [1:50] 0.206 0.206 0.206 0.206 0.206 ...
-##   ..$ a      : num [1:50] -0.174 -0.174 -0.174 -0.174 -0.174 ...
-##   ..$ eta    : num [1:100] 1.83 1.83 1.83 1.83 1.83 ...
-##   ..$ LL.p   : num -0.187
-## 
-## 
-## parameters:
-## 
-## psi.fs tau2 tau.lp alpha.p a mu.lp LL.pCompiling model graph
+## Compiling model graph
 ##    Resolving undeclared variables
 ##    Allocating nodes
-## Graph information:
-##    Observed stochastic nodes: 6211
-##    Unobserved stochastic nodes: 5204
-##    Total graph size: 45069
+##    Graph Size: 64328
 ## 
 ## Initializing model
##    user  system elapsed 
-##   76.23    0.08   85.56
+##   70.20    0.08   70.53
# Lets look at the results
 ## The object returned is a list with one element for each species
-names(occ_out)
+names(occ_out)
## [1] "a" "b" "c" "d"
# Each of these is an object of class 'occDet'
-class(occ_out$a)
+class(occ_out$a)
## [1] "occDet"
# Inside these elements is the information of interest
-names(occ_out$a)
-
##  [1] "model"                "BUGSoutput"           "parameters.to.save"  
-##  [4] "model.file"           "n.iter"               "DIC"                 
-##  [7] "SPP_NAME"             "min_year"             "max_year"            
-## [10] "nsites"               "nvisits"              "species_sites"       
-## [13] "species_observations"
+names(occ_out$a)
+
+## [1] "model"              "BUGSoutput"         "parameters.to.save"
+## [4] "model.file"         "n.iter"             "DIC"               
+## [7] "SPP_NAME"           "min_year"           "max_year"
# Of particular interest to many users will be the summary
 # data in the BUGSoutput
-head(occ_out$a$BUGSoutput$summary)
-
-##            mean       sd        2.5%       25%        50%        75%
-## LL.p  0.7427121 0.323516  0.04685566  0.530657  0.7499549  0.9781109
-## a[1]  2.4522909 3.692271 -2.55811307 -0.484016  1.2133769  5.3436343
-## a[2]  3.0376645 4.332380 -4.77258866 -1.402761  4.2795374  6.6267167
-## a[3] -0.6545089 3.033347 -4.55763097 -2.732324 -1.5474491  1.1263473
-## a[4] -2.4836398 1.784346 -6.02400397 -3.432000 -2.6641979 -1.3128673
-## a[5]  1.5554312 3.547313 -3.26809601 -1.060623  0.5196220  4.1324295
-##          97.5%     Rhat n.eff
-## LL.p 1.3318468 1.055220    37
-## a[1] 9.4722309 2.242779     5
-## a[2] 9.3896333 3.737906     3
-## a[3] 7.7935907 1.800707     6
-## a[4] 0.4916744 1.060894    55
-## a[5] 9.1627592 1.701485     6
+head(occ_out$a$BUGSoutput$summary)
+
+##                   mean         sd        2.5%          25%         50%
+## LL.p         0.2691009  0.3102804  -0.3519707   0.04895419   0.2556833
+## deviance   649.7160858 51.1319847 544.5524433 614.12892478 666.5255290
+## fit        281.9522389 89.1997805 190.9883913 219.13112637 247.9393552
+## fit.new    283.9149866 90.5760094 185.6004878 217.89008520 254.7165849
+## mean_early   0.3701971  0.1718789   0.1274588   0.25082920   0.3516627
+## mean_late    0.4114516  0.1241131   0.1808044   0.35083037   0.4000000
+##                    75%       97.5%     Rhat n.eff
+## LL.p         0.4929832   0.8780177 0.993288   190
+## deviance   688.2839296 710.7614446 1.031683    64
+## fit        316.7603165 499.9895976 1.009975   120
+## fit.new    319.0321333 504.5251365 1.006007   160
+## mean_early   0.4283236   0.7837351 1.841981     5
+## mean_late    0.4591644   0.7623305 1.295740    12
# We have included a plotting feature for objects of class
 # occDet which provides a useful visualisation of the trend
 # in occupancy over time
-plot(occ_out$a)
-
+plot(occ_out$a)

Here we have run a small example, but in reality these models are usually run for many thousands of iterations, making the analysis of more than a handful of species impractical. For those with access to the necessary facilities it is possible to parallelise across species. To do this we use a pair of functions that are used internally by occDetModel. These are formatOccData, which is used to format our occurrence data into the format needed by JAGS, and occDetFunc, the function which undertakes the modelling.

# First format our data
 formattedOccData <- formatOccData(taxa = myData$taxa,
@@ -1227,11 +782,11 @@ 

## Warning in errorChecks(taxa = taxa, site = site, time_period =
 ## time_period): 94 out of 8000 observations will be removed as duplicates
# This is a list of two elements
-names(formattedOccData)
+names(formattedOccData)

## [1] "spp_vis"    "occDetdata"

formatOccData returns a list of length 2; the first element ‘spp_vis’ is a data.frame with visit (unique combination of site and time period) in the first column and taxa for all the following columns. Values in taxa columns are either TRUE or FALSE depending on whether they were observed on that visit.

# Lets have a look at spp_vis
-head(formattedOccData$spp_vis[,1:5])
+head(formattedOccData$spp_vis[,1:5])
##            visit     a     b     c     d
 ## 1 A1001950-01-04 FALSE FALSE FALSE FALSE
 ## 2 A1001950-11-01  TRUE FALSE FALSE FALSE
@@ -1241,7 +796,7 @@ 

## 6 A1001953-02-22 FALSE FALSE FALSE FALSE
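Because the species columns of spp_vis are logical, simple summaries fall straight out of base R; a small aside that is not part of the original vignette:

# Number of visits on which each species was recorded
head(colSums(formattedOccData$spp_vis[, -1]))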

The second element (‘occDetData’) is a data frame giving the site, list length (the number of species observed on a visit) and year for each visit.

# Lets have a look at occDetData
-head(formattedOccData$occDetdata)
+head(formattedOccData$occDetdata)
##            visit site L year
 ## 1 A1001950-01-04 A100 2 1950
 ## 3 A1001950-11-01 A100 1 1950
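Since the list length column L feeds the detection model, it can be worth eyeballing its distribution; a small aside using the column shown above:

# How many visits have a list length of 1, 2, 3, ...?
table(formattedOccData$occDetdata$L)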
@@ -1252,19 +807,23 @@ 

With our data in the correct format this can now go into the modelling function

# Use the occupancy modelling function to parrellise the process
 # Here we are going to use the package snowfall
-library(snowfall)
+library(snowfall)
 
 # I have 4 cpus on my PC so I set cpus to 4
 # when I initialise the cluster
-sfInit(parallel = TRUE, cpus = 4)
-
## snowfall 1.84-6.1 initialized (using snow 0.4-2): parallel execution on 4 CPUs.
+sfInit(parallel = TRUE, cpus = 4)

+
## Warning in searchCommandline(parallel, cpus = cpus, type = type,
+## socketHosts = socketHosts, : Unknown option on commandline:
+## rmarkdown::render('W:/PYWELL_SHARED/Pywell Projects/BRC/Tom August/R
+## Packages/Trend analyses/sparta/pre_vignette/sparta_vignette.Rmd', encoding
+
## snowfall 1.84-6 initialized (using snow 0.3-13): parallel execution on 4 CPUs.
# Export my data to the cluster
-sfExport('formattedOccData')
+sfExport('formattedOccData')
 
 # I create a function that takes a species name and runs my model
 occ_mod_function <- function(taxa_name){
   
-  library(sparta)
+  library(sparta)
   
   occ_out <- occDetFunc(taxa_name = taxa_name,
                         n_iterations = 200,
@@ -1276,39 +835,41 @@ 

 }
 
 # I then run this in parallel
-system.time({
-para_out <- sfClusterApplyLB(c('a','b','c','d'), occ_mod_function)
+system.time({
+para_out <- sfClusterApplyLB(c('a','b','c','d'), occ_mod_function)
 })

##    user  system elapsed 
-##    0.01    0.00   40.04
+##    0.02    0.01   25.95
# Name each element of this output by the species
-for(i in  1:length(para_out)) names(para_out)[i] <- para_out[[i]]$SPP_NAME
+for(i in  1:length(para_out)) names(para_out)[i] <- para_out[[i]]$SPP_NAME
 
 # Stop the cluster
-sfStop()
+sfStop()
## 
 ## Stopping cluster
# This takes about half the time of the 
 # serial version we ran earlier, and the resulting object 
 # is the same (since we set the random seed to be the same
 # in each)
-head(para_out$a$BUGSoutput$summary)
-
-##            mean       sd        2.5%       25%        50%        75%
-## LL.p  0.7427121 0.323516  0.04685566  0.530657  0.7499549  0.9781109
-## a[1]  2.4522909 3.692271 -2.55811307 -0.484016  1.2133769  5.3436343
-## a[2]  3.0376645 4.332380 -4.77258866 -1.402761  4.2795374  6.6267167
-## a[3] -0.6545089 3.033347 -4.55763097 -2.732324 -1.5474491  1.1263473
-## a[4] -2.4836398 1.784346 -6.02400397 -3.432000 -2.6641979 -1.3128673
-## a[5]  1.5554312 3.547313 -3.26809601 -1.060623  0.5196220  4.1324295
-##          97.5%     Rhat n.eff
-## LL.p 1.3318468 1.055220    37
-## a[1] 9.4722309 2.242779     5
-## a[2] 9.3896333 3.737906     3
-## a[3] 7.7935907 1.800707     6
-## a[4] 0.4916744 1.060894    55
-## a[5] 9.1627592 1.701485     6
-plot(para_out$a)
-

+head(para_out$a$BUGSoutput$summary)
+
+##                   mean         sd        2.5%          25%         50%
+## LL.p         0.2691009  0.3102804  -0.3519707   0.04895419   0.2556833
+## deviance   649.7160858 51.1319847 544.5524433 614.12892478 666.5255290
+## fit        281.9522389 89.1997805 190.9883913 219.13112637 247.9393552
+## fit.new    283.9149866 90.5760094 185.6004878 217.89008520 254.7165849
+## mean_early   0.3708781  0.1715081   0.1307932   0.25000000   0.3466667
+## mean_late    0.4106272  0.1219015   0.1887431   0.35000000   0.3966667
+##                    75%       97.5%     Rhat n.eff
+## LL.p         0.4929832   0.8780177 0.993288   190
+## deviance   688.2839296 710.7614446 1.031683    64
+## fit        316.7603165 499.9895976 1.009975   120
+## fit.new    319.0321333 504.5251365 1.006007   160
+## mean_early   0.4349904   0.7779150 1.853765     5
+## mean_late    0.4533333   0.7547538 1.302426    11
+
plot(para_out$a)
+
+
+

This same approach can be used on cluster computers, which can have hundreds of processors, to dramatically reduce run times.
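As a rough sketch of how that might scale up (the host names here are hypothetical; the type and socketHosts arguments of sfInit, visible in the warning above, are what let snowfall reach other machines over sockets):

```r
# A minimal sketch, assuming two hypothetical 8-core machines
# ('node1', 'node2') reachable over sockets - not a tested recipe
library(snowfall)

sfInit(parallel = TRUE, cpus = 16, type = 'SOCK',
       socketHosts = rep(c('node1', 'node2'), each = 8))

sfExport('formattedOccData')
para_out <- sfClusterApplyLB(c('a', 'b', 'c', 'd'), occ_mod_function)
sfStop()
```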

@@ -1316,7 +877,7 @@

Frescalo

The frescalo method is outlined in Hill (2012) and is a means to account for both spatial and temporal bias. Isaac et al (2014) showed it to be a good method for data that are aggregated into time periods, such as when comparing atlases. The frescalo method is run using a .exe; you will need to download this file from https://github.com/BiologicalRecordsCentre/frescalo. Once you have downloaded the .exe, make a note of the directory you placed it in, as we will need it in a moment.

Again we will assume that your data is in a ‘what, where, when’ format similar to that we used in the previous method:

-
head(myData)
+
head(myData)
##   taxa site time_period tp
 ## 1    r  A51  1970-01-14  3
 ## 2    v  A87  1980-09-29  4
@@ -1326,8 +887,8 @@ 

## 6 x A48 1990-02-25 5

Frescalo’s requirements in terms of data structure and types are a little different from those of the other functions we have seen. Firstly, the entire data.frame is passed in as an argument called Data, and the column names of your various elements (taxa, site, etc.) are given as other arguments. Secondly, frescalo requires that the ‘when’ component is either a column of years or two columns, one of ‘start date’ and one of ‘end date’. Our data as presented above do not fit this format, so first we must reformat them. In our situation the simplest thing to do is to add a column giving the year. Since frescalo aggregates across time periods (often decades or greater) this loss of temporal resolution is not an issue.

# Add a year column
-myData$year <- as.numeric(format(myData$time_period, '%Y'))
-head(myData)
+myData$year <- as.numeric(format(myData$time_period, '%Y'))
+head(myData)
##   taxa site time_period tp year
 ## 1    r  A51  1970-01-14  3 1970
 ## 2    v  A87  1980-09-29  4 1980
@@ -1338,23 +899,23 @@ 

Now that we have our data in the correct format for frescalo, there is one other major component we need: a weights file. You can find out more about the weights file and what it is used for in the original paper (Hill, 2012). In short, the weights file describes the similarity between sites in your dataset, and this information is used to weight the analysis of each site accordingly. If you are undertaking this analysis in the UK at 10km square resolution there are some built-in weights files you can use. Some of these weights files use the UK land cover map instead of floristic similarity (as used in Hill (2012)). You can find out more about these in the frescalo help file.

For the sake of demonstration let us assume that you do not have a weights file for your analysis, or that you want to create your own. To create a weights file you need two things: a measure of physical distance between your sites and a measure of similarity. In the original paper this similarity measure was floristic similarity, but it could also be habitat similarity or whatever is relevant for the taxa you are studying. In this example I have a table of distances and a table of land cover proportions at each site.

# Here is the distance table
-head(myDistances)
+head(myDistances)

##     x   y     dist
 ## 1 A51 A51    0.000
-## 2 A87 A51 6017.644
-## 3 A56 A51 5155.147
-## 4 A28 A51 4031.708
-## 5 A77 A51 8803.663
-## 6 A48 A51 3647.278
+## 2 A87 A51 4074.258
+## 3 A56 A51 6595.711
+## 4 A28 A51 1531.943
+## 5 A77 A51 5732.942
+## 6 A48 A51 2394.873
# Here is our habitat data
-head(myHabitatData)
-
##   site  grassland  woodland   heathland      urban freshwater
-## 1  A51 0.13493868 0.2917507 0.150144845 0.10930222  0.3138636
-## 2  A87 0.26523781 0.2407631 0.175456197 0.10630542  0.2122375
-## 3  A56 0.24293108 0.3725655 0.007039022 0.11006062  0.2674037
-## 4  A28 0.22788880 0.2673961 0.251712762 0.02163138  0.2313710
-## 5  A77 0.07623146 0.1800027 0.261907694 0.31823613  0.1636220
-## 6  A48 0.52850290 0.1035430 0.007142256 0.05294400  0.3078678
+head(myHabitatData)
+
##   site grassland  woodland  heathland      urban freshwater
+## 1  A51 0.1169123 0.1084992 0.28376157 0.37312774 0.11769919
+## 2  A87 0.1781151 0.1307214 0.35258119 0.26223604 0.07634632
+## 3  A56 0.2359391 0.1263644 0.25898930 0.13490734 0.24379991
+## 4  A28 0.3100922 0.1373896 0.20870313 0.28659095 0.05722412
+## 5  A77 0.2034073 0.4897063 0.05368464 0.01677132 0.23643036
+## 6  A48 0.2397599 0.1046128 0.34250853 0.13055663 0.18256221
# With our distance and habitat tables in hand we can
 # use the createWeights function to build our weights file
 # I have changed the defaults of dist_sub and sim_sub since
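The call itself falls outside this excerpt. A minimal sketch, assuming createWeights takes the two tables plus the dist_sub and sim_sub arguments named in the comment (argument names and values here are illustrative only; check ?createWeights for the real interface):

```r
# Illustrative only - consult ?createWeights for the actual arguments
myWeights <- createWeights(distances = myDistances,
                           attributes = myHabitatData,
                           dist_sub = 20,  # consider the 20 nearest sites...
                           sim_sub = 10)   # ...and keep the 10 most similar
```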
@@ -1377,14 +938,14 @@ 

## 90%
## 100%
## Complete

-
head(myWeights)
+
head(myWeights)
##   target neighbour weight
-## 1    A51       A13 0.0013
-## 2    A51       A36 0.4783
-## 3    A51       A40 0.0030
+## 1    A51        A2 0.0311
+## 2    A51       A47 0.1150
+## 3    A51       A49 0.0012
 ## 4    A51       A51 1.0000
-## 5    A51       A56 0.0019
-## 6    A51       A65 0.0274
+## 5    A51       A53 0.0160
+## 6    A51       A62 0.2687

The createWeights function follows the procedure outlined in Hill (2012) for creating weights, and more information can be found in the function's help file. With our data and weights file in hand we are ready to proceed with frescalo. As with other functions, frescalo can take a range of additional arguments, which you can see by entering ?frescalo at the console; here we will do a minimal example.

# First we need to enter the location where we placed the .exe
 # In my case I saved it to my documents folder
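The assignment itself is cut from this excerpt; it would be along these lines (the path and file name below are hypothetical, point it at wherever you saved the download):

```r
# Hypothetical location of the downloaded frescalo .exe
myFrescaloPath <- 'C:/Users/myUser/Documents/frescalo.exe'
```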
@@ -1392,8 +953,8 @@ 

# I then want to set up the time periods I want to analyse
# Here I say I want to compare 1980-89 to 1990-99
-myTimePeriods <- data.frame(start = c(1980, 1990), end = c(1989, 1999))
-head(myTimePeriods)

+myTimePeriods <- data.frame(start = c(1980, 1990), end = c(1989, 1999))
+head(myTimePeriods)
##   start  end
 ## 1  1980 1989
 ## 2  1990 1999
@@ -1410,9 +971,6 @@

year = 'year', Fres_weights = myWeights, sinkdir = myFolder)
-
-## Warning in frescalo(Data = myData, frespath = myFrescaloPath, time_periods
-## = myTimePeriods, : sinkdir already contains frescalo output. New data saved
-## in ~/myFolder/frescalo_171221(2)
## 
 ## SAVING DATA TO FRESCALO WORKING DIRECTORY
 ## ********************
@@ -1422,43 +980,43 @@ 

## ********************

## Warning in run_fresc_file(fres_data = Data, output_dir = fresoutput,
 ## frescalo_path = frespath, : Your value of phi (0.74) is smaller than the
-## 98.5 percentile of input phi (0.88). It is reccommended your phi be similar
+## 98.5 percentile of input phi (0.89). It is reccommended your phi be similar
 ## to this value. For more information see Hill (2011) reference in frescalo
 ## help file
## Building Species List - Complete
 ## Outputting Species Results
-##  Species 1 of 26 - a - 21/12/2017 14:58:57
-##  Species 2 of 26 - b - 21/12/2017 14:58:57
-##  Species 3 of 26 - c - 21/12/2017 14:58:57
-##  Species 4 of 26 - d - 21/12/2017 14:58:57
-##  Species 5 of 26 - e - 21/12/2017 14:58:57
-##  Species 6 of 26 - f - 21/12/2017 14:58:57
-##  Species 7 of 26 - g - 21/12/2017 14:58:57
-##  Species 8 of 26 - h - 21/12/2017 14:58:57
-##  Species 9 of 26 - i - 21/12/2017 14:58:57
-##  Species 10 of 26 - j - 21/12/2017 14:58:57
-##  Species 11 of 26 - k - 21/12/2017 14:58:57
-##  Species 12 of 26 - l - 21/12/2017 14:58:57
-##  Species 13 of 26 - m - 21/12/2017 14:58:58
-##  Species 14 of 26 - n - 21/12/2017 14:58:58
-##  Species 15 of 26 - o - 21/12/2017 14:58:58
-##  Species 16 of 26 - p - 21/12/2017 14:58:58
-##  Species 17 of 26 - q - 21/12/2017 14:58:58
-##  Species 18 of 26 - r - 21/12/2017 14:58:58
-##  Species 19 of 26 - s - 21/12/2017 14:58:58
-##  Species 20 of 26 - t - 21/12/2017 14:58:58
-##  Species 21 of 26 - u - 21/12/2017 14:58:58
-##  Species 22 of 26 - v - 21/12/2017 14:58:58
-##  Species 23 of 26 - w - 21/12/2017 14:58:58
-##  Species 24 of 26 - x - 21/12/2017 14:58:58
-##  Species 25 of 26 - y - 21/12/2017 14:58:58
-##  Species 26 of 26 - z - 21/12/2017 14:58:58
+##  Species 1 of 26 - a - 10/07/2015 14:21:05
+##  Species 2 of 26 - b - 10/07/2015 14:21:05
+##  Species 3 of 26 - c - 10/07/2015 14:21:05
+##  Species 4 of 26 - d - 10/07/2015 14:21:05
+##  Species 5 of 26 - e - 10/07/2015 14:21:05
+##  Species 6 of 26 - f - 10/07/2015 14:21:05
+##  Species 7 of 26 - g - 10/07/2015 14:21:05
+##  Species 8 of 26 - h - 10/07/2015 14:21:05
+##  Species 9 of 26 - i - 10/07/2015 14:21:05
+##  Species 10 of 26 - j - 10/07/2015 14:21:05
+##  Species 11 of 26 - k - 10/07/2015 14:21:05
+##  Species 12 of 26 - l - 10/07/2015 14:21:05
+##  Species 13 of 26 - m - 10/07/2015 14:21:05
+##  Species 14 of 26 - n - 10/07/2015 14:21:05
+##  Species 15 of 26 - o - 10/07/2015 14:21:05
+##  Species 16 of 26 - p - 10/07/2015 14:21:05
+##  Species 17 of 26 - q - 10/07/2015 14:21:05
+##  Species 18 of 26 - r - 10/07/2015 14:21:05
+##  Species 19 of 26 - s - 10/07/2015 14:21:05
+##  Species 20 of 26 - t - 10/07/2015 14:21:05
+##  Species 21 of 26 - u - 10/07/2015 14:21:05
+##  Species 22 of 26 - v - 10/07/2015 14:21:05
+##  Species 23 of 26 - w - 10/07/2015 14:21:05
+##  Species 24 of 26 - x - 10/07/2015 14:21:05
+##  Species 25 of 26 - y - 10/07/2015 14:21:05
+##  Species 26 of 26 - z - 10/07/2015 14:21:05
 ## [1] "frescalo complete"

We get a warning from this analysis that our value of phi is too low. In this case that is because our simulated data suggest that every species is found at every site in our time periods. This is a little unrealistic, but should you get a similar warning with your own data you might want to consult Hill (2012) and change your input value of phi.
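Should you want to raise phi towards the value the warning reports, a sketch of the re-run might look like this (assuming, as the warning implies, that frescalo accepts a phi argument; 0.89 is simply the percentile quoted above, not a recommendation):

```r
# Re-run with a larger phi - a sketch under the assumptions above
frescalo_results <- frescalo(Data = myData,
                             frespath = myFrescaloPath,
                             time_periods = myTimePeriods,
                             year = 'year',
                             Fres_weights = myWeights,
                             sinkdir = myFolder,
                             phi = 0.89)
```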

The object that is returned (frescalo_results in my case) is an object of class frescalo. This means there are a couple of special methods we can use with it.

# Using 'summary' gives a quick overview of our data
 # This can be useful to double check that your data was read in correctly
-summary(frescalo_results)
+summary(frescalo_results)
##  Actual numbers in data 
 ##      Number of samples           100 
 ##      Number of species            26 
@@ -1468,55 +1026,55 @@ 

##      Benchmark exclusions          0 
##      Filter locations included     0

# Using 'print' we get a preview of the results
-print(frescalo_results)
+print(frescalo_results)
## 
 ## Preview of $paths - file paths to frescalo log, stats, freq, trend .csv files:
 ## 
-## [1] "~/myFolder/frescalo_171221(2)/Output/Log.txt"           
-## [2] "~/myFolder/frescalo_171221(2)/Output/Stats.csv"         
-## [3] "~/myFolder/frescalo_171221(2)/Output/Freq.csv"          
-## [4] "~/myFolder/frescalo_171221(2)/Output/Trend.csv"         
-## [5] "~/myFolder/frescalo_171221(2)/Output/Freq_quickload.txt"
+## [1] "~/myFolder/frescalo_150710/Output/Log.txt"           
+## [2] "~/myFolder/frescalo_150710/Output/Stats.csv"         
+## [3] "~/myFolder/frescalo_150710/Output/Freq.csv"          
+## [4] "~/myFolder/frescalo_150710/Output/Trend.csv"         
+## [5] "~/myFolder/frescalo_150710/Output/Freq_quickload.txt"
 ## 
 ## 
 ## Preview of $trend - trends file, giving the tfactor value for each species at each time period:
 ## 
 ##   Species   Time TFactor StDev  X Xspt Xest N.0.00 N.0.98
-## 1       a 1984.5   0.504 0.187  8    8    8     92      0
-## 2       a 1994.5   1.229 0.326 17   17   17     92      0
-## 3       j 1984.5   1.174 0.203 46   46   46    100      1
-## 4       j 1994.5   0.692 0.131 35   35   35    100      1
-## 5       k 1984.5   1.006 0.179 44   43   43    100      3
-## 6       k 1994.5   0.943 0.165 46   46   46    100      3
+## 1       a 1984.5   0.544 0.201  8    8    8     92      0
+## 2       a 1994.5   1.143 0.302 17   17   17     92      0
+## 3       j 1984.5   1.372 0.237 46   45   45    100      0
+## 4       j 1994.5   0.702 0.133 35   35   35    100      1
+## 5       k 1984.5   0.961 0.167 44   43   43    100      0
+## 6       k 1994.5   0.816 0.144 46   45   45    100      5
 ## 
 ## 
 ## Preview of $stat - statistics for each hectad in the analysis:
 ## 
 ##   Location Loc_no No_spp Phi_in Alpha Wgt_n2 Phi_out Spnum_in Spnum_out
-## 1       A1      1     11  0.664  1.44   3.62    0.74     13.2      15.6
-## 2      A10      2     13  0.722  1.10   3.72    0.74     14.7      15.3
-## 3     A100      3     18  0.805  0.73   2.98    0.74     17.4      15.4
-## 4      A11      4     16  0.712  1.15   3.41    0.74     15.3      16.2
-## 5      A12      5     11  0.703  1.20   3.67    0.74     14.5      15.7
-## 6      A13      6      8  0.661  1.67   2.86    0.74     11.5      14.3
+## 1       A1      1     11  0.815  0.66   1.58    0.74     11.5       9.8
+## 2      A10      2     13  0.717  1.14   3.77    0.74     14.9      15.7
+## 3     A100      3     18  0.828  0.58   3.01    0.74     18.1      14.9
+## 4      A11      4     16  0.847  0.49   1.81    0.74     16.9      13.3
+## 5      A12      5     11  0.718  1.15   3.22    0.74     14.5      15.3
+## 6      A13      6      8  0.681  1.32   3.88    0.74     14.5      16.4
 ##   Iter
-## 1    7
+## 1   15
 ## 2    5
-## 3   12
-## 4    7
-## 5    6
-## 6    3
+## 3    8
+## 4    9
+## 5    3
+## 6    8
 ## 
 ## 
 ## Preview of $freq - rescaled frequencies for each location and species:
 ## 
 ##   Location Species Pres   Freq  Freq1 SDFrq1 Rank Rank1
-## 1       A1       v    1 1.0000 1.0000 0.0000    1 0.064
-## 2       A1       r    1 0.9469 0.9856 0.0390    2 0.128
-## 3       A1       j    1 0.9168 0.9724 0.0593    3 0.192
-## 4       A1       d    1 0.9168 0.9724 0.0593    4 0.256
-## 5       A1       x    1 0.7588 0.8716 0.1647    5 0.320
-## 6       A1       w    1 0.7561 0.8696 0.1663    6 0.384
+## 1       A1       v    1 0.9778 0.9177 0.1372    1 0.102
+## 2       A1       k    1 0.9722 0.9046 0.1494    2 0.204
+## 3       A1       w    1 0.9634 0.8856 0.1659    3 0.305
+## 4       A1       y    1 0.9563 0.8715 0.1776    4 0.407
+## 5       A1       x    1 0.9412 0.8440 0.1992    5 0.509
+## 6       A1       e    1 0.8965 0.7740 0.2491    6 0.611
 ## 
 ## 
 ## Preview of $log - log file:
@@ -1529,7 +1087,7 @@ 

## Filter locations included 0 
## 
## 
-## 98.5 percentile of input phi 0.88 
+## 98.5 percentile of input phi 0.89 
## Target value of phi 0.74 
## 
## 
@@ -1538,65 +1096,65 @@

## Preview of $lm_stats - trends in tfactor over time: 
## 
## SPECIES NAME b a b_std_err b_tval b_pval a_std_err 
-## 1 S1 a 0.0725 -143.37225 NA NA NA NA 
-## 12 S2 b 0.0132 -25.41440 NA NA NA NA 
-## 20 S3 c 0.0169 -32.58605 NA NA NA NA 
-## 21 S4 d 0.0433 -85.14685 NA NA NA NA 
-## 22 S5 e -0.0013 3.49585 NA NA NA NA 
-## 23 S6 f 0.0577 -113.85065 NA NA NA NA 
-## a_tval a_pval adj_r2 r2 F_val F_num_df F_den_df Ymin Ymax Z_VAL 
-## 1 NA NA NA 1 NA 1 0 1984.5 1994.5 1.929085 
-## 12 NA NA NA 1 NA 1 0 1984.5 1994.5 0.416352 
-## 20 NA NA NA 1 NA 1 0 1984.5 1994.5 0.490391 
-## 21 NA NA NA 1 NA 1 0 1984.5 1994.5 1.388670 
-## 22 NA NA NA 1 NA 1 0 1984.5 1994.5 -0.042350 
-## 23 NA NA NA 1 NA 1 0 1984.5 1994.5 2.111953 
-## SIG_95 
-## 1 FALSE 
-## 12 FALSE 
-## 20 FALSE 
-## 21 FALSE 
-## 22 FALSE 
-## 23 TRUE

+## 1 S1 a 0.0599 -118.32755 NA NA NA NA 
+## 12 S2 b 0.0021 -3.27645 NA NA NA NA 
+## 20 S3 c 0.0045 -8.04525 NA NA NA NA 
+## 21 S4 d 0.0365 -71.60625 NA NA NA NA 
+## 22 S5 e -0.0046 9.96270 NA NA NA NA 
+## 23 S6 f 0.0326 -63.82470 NA NA NA NA 
+## a_tval a_pval adj_r2 r2 F_val F_num_df F_den_df Ymin Ymax 
+## 1 NA NA NA 1 NA 1 0 1984.5 1994.5 
+## 12 NA NA NA 1 NA 1 0 1984.5 1994.5 
+## 20 NA NA NA 1 NA 1 0 1984.5 1994.5 
+## 21 NA NA NA 1 NA 1 0 1984.5 1994.5 
+## 22 NA NA NA 1 NA 1 0 1984.5 1994.5 
+## 23 NA NA NA 1 NA 1 0 1984.5 1994.5 
+## Z_VAL SIG_95 
+## 1 1.65116558 FALSE 
+## 12 0.06358704 FALSE 
+## 20 0.14903522 FALSE 
+## 21 1.17443512 FALSE 
+## 22 -0.17104272 FALSE 
+## 23 1.12652478 FALSE
# There is a lot of information here and you can read more about
 # what these data mean by looking at the frescalo help file
 # The files detailed in paths are also in the object returned
 frescalo_results$paths
-
## [1] "~/myFolder/frescalo_171221(2)/Output/Log.txt"           
-## [2] "~/myFolder/frescalo_171221(2)/Output/Stats.csv"         
-## [3] "~/myFolder/frescalo_171221(2)/Output/Freq.csv"          
-## [4] "~/myFolder/frescalo_171221(2)/Output/Trend.csv"         
-## [5] "~/myFolder/frescalo_171221(2)/Output/Freq_quickload.txt"
-
names(frescalo_results)
+
## [1] "~/myFolder/frescalo_150710/Output/Log.txt"           
+## [2] "~/myFolder/frescalo_150710/Output/Stats.csv"         
+## [3] "~/myFolder/frescalo_150710/Output/Freq.csv"          
+## [4] "~/myFolder/frescalo_150710/Output/Trend.csv"         
+## [5] "~/myFolder/frescalo_150710/Output/Freq_quickload.txt"
+
names(frescalo_results)
## [1] "paths"    "trend"    "stat"     "freq"     "log"      "lm_stats"
# However we additionally get some model results in our returned object
 # under '$lm_stats'

The results from frescalo may seem complex at first and I suggest reading the Value section of the frescalo help file for details. In brief: frescalo_results$paths lists the file paths of the raw data files for $log, $stat, $freq and $trend, in that order. frescalo_results$trend is a data.frame giving the time factors (a measure of probability of occurrence relative to benchmark species) for each species-timeperiod combination. frescalo_results$stat is a data.frame giving details about sites, such as estimated species richness. frescalo_results$freq is a data.frame of the species frequencies, that is, the probabilities that a species was present at a certain location. frescalo_results$log is a simple report of the console output from the .exe. frescalo_results$lm_stats is a data.frame giving the results of a linear regression of TFactors for each species when more than two time periods are used. If only two time periods are used (as in our example) the linear modelling section of this data.frame is filled with NAs and a z-test is performed instead (results are given in the last columns).

# Let's look at some results for the first three species
-frescalo_results$lm_stats[1:3, c('NAME','Z_VAL','SIG_95')]
-
##    NAME    Z_VAL SIG_95
-## 1     a 1.929085  FALSE
-## 12    b 0.416352  FALSE
-## 20    c 0.490391  FALSE
+frescalo_results$lm_stats[1:3, c('NAME','Z_VAL','SIG_95')]
+
##    NAME      Z_VAL SIG_95
+## 1     a 1.65116558  FALSE
+## 12    b 0.06358704  FALSE
+## 20    c 0.14903522  FALSE
# None of these have a significant change using a z-test
 # Let's look at the raw data
-frescalo_results$trend[frescalo_results$trend$Species %in% c('a', 'b', 'c'),
-                       c('Species', 'Time', 'TFactor', 'StDev')]
+frescalo_results$trend[frescalo_results$trend$Species %in% c('a', 'b', 'c'),
+                       c('Species', 'Time', 'TFactor', 'StDev')]
##    Species   Time TFactor StDev
-## 1        a 1984.5   0.504 0.187
-## 2        a 1994.5   1.229 0.326
-## 23       b 1984.5   0.781 0.215
-## 24       b 1994.5   0.913 0.233
-## 39       c 1984.5   0.952 0.234
-## 40       c 1994.5   1.121 0.253
+## 1        a 1984.5   0.544 0.201
+## 2        a 1994.5   1.143 0.302
+## 23       b 1984.5   0.891 0.237
+## 24       b 1994.5   0.912 0.230
+## 39       c 1984.5   0.885 0.215
+## 40       c 1994.5   0.930 0.212
# We can see from these results that the big standard deviations on 
 # the tfactor values mean there is no real difference between the 
 # two time periods

If your data are from the UK and sites are given as grid references, there is functionality to plot a simple output of your results.

# This only works with UK grid references
 # We can load an example dataset from the UK
-data(unicorns)
-head(unicorns)
+data(unicorns)
+head(unicorns)
##          TO_STARTDATE                Date hectad   kmsq    CONCEPT
 ## 1 1968-08-06 00:00:00 1968-08-06 00:00:00   NZ28   <NA> Species 18
 ## 2 1951-05-12 00:00:00 1951-05-12 00:00:00   SO34   <NA> Species 11
@@ -1618,7 +1176,7 @@ 

## your weights file, these will be ignored

## Warning in frescalo(Data = unicorns, frespath = myFrescaloPath,
 ## time_periods = myTimePeriods, : sinkdir already contains frescalo output.
-## New data saved in ~/myFolder/frescalo_171221(3)
+## New data saved in ~/myFolder/frescalo_150710(2)
## 
 ## SAVING DATA TO FRESCALO WORKING DIRECTORY
 ## ********************
@@ -1629,67 +1187,69 @@ 

## 
## Building Species List - Complete 
## Outputting Species Results 
-## Species 1 of 55 - Species 1 - 21/12/2017 14:59:23 
-## Species 2 of 55 - Species 10 - 21/12/2017 14:59:23 
-## Species 3 of 55 - Species 11 - 21/12/2017 14:59:23 
-## Species 4 of 55 - Species 12 - 21/12/2017 14:59:23 
-## Species 5 of 55 - Species 13 - 21/12/2017 14:59:23 
-## Species 6 of 55 - Species 14 - 21/12/2017 14:59:23 
-## Species 7 of 55 - Species 15 - 21/12/2017 14:59:23 
-## Species 8 of 55 - Species 16 - 21/12/2017 14:59:23 
-## Species 9 of 55 - Species 17 - 21/12/2017 14:59:23 
-## Species 10 of 55 - Species 18 - 21/12/2017 14:59:23 
-## Species 11 of 55 - Species 19 - 21/12/2017 14:59:23 
-## Species 12 of 55 - Species 2 - 21/12/2017 14:59:23 
-## Species 13 of 55 - Species 20 - 21/12/2017 14:59:23 
-## Species 14 of 55 - Species 21 - 21/12/2017 14:59:23 
-## Species 15 of 55 - Species 22 - 21/12/2017 14:59:23 
-## Species 16 of 55 - Species 23 - 21/12/2017 14:59:23 
-## Species 17 of 55 - Species 24 - 21/12/2017 14:59:23 
-## Species 18 of 55 - Species 25 - 21/12/2017 14:59:23 
-## Species 19 of 55 - Species 27 - 21/12/2017 14:59:23 
-## Species 20 of 55 - Species 28 - 21/12/2017 14:59:23 
-## Species 21 of 55 - Species 29 - 21/12/2017 14:59:23 
-## Species 22 of 55 - Species 3 - 21/12/2017 14:59:23 
-## Species 23 of 55 - Species 30 - 21/12/2017 14:59:23 
-## Species 24 of 55 - Species 31 - 21/12/2017 14:59:24 
-## Species 25 of 55 - Species 32 - 21/12/2017 14:59:24 
-## Species 26 of 55 - Species 33 - 21/12/2017 14:59:24 
-## Species 27 of 55 - Species 34 - 21/12/2017 14:59:24 
-## Species 28 of 55 - Species 35 - 21/12/2017 14:59:24 
-## Species 29 of 55 - Species 36 - 21/12/2017 14:59:24 
-## Species 30 of 55 - Species 37 - 21/12/2017 14:59:24 
-## Species 31 of 55 - Species 38 - 21/12/2017 14:59:24 
-## Species 32 of 55 - Species 39 - 21/12/2017 14:59:24 
-## Species 33 of 55 - Species 4 - 21/12/2017 14:59:24 
-## Species 34 of 55 - Species 40 - 21/12/2017 14:59:24 
-## Species 35 of 55 - Species 41 - 21/12/2017 14:59:24 
-## Species 36 of 55 - Species 42 - 21/12/2017 14:59:24 
-## Species 37 of 55 - Species 44 - 21/12/2017 14:59:24 
-## Species 38 of 55 - Species 45 - 21/12/2017 14:59:24 
-## Species 39 of 55 - Species 46 - 21/12/2017 14:59:24 
-## Species 40 of 55 - Species 47 - 21/12/2017 14:59:24 
-## Species 41 of 55 - Species 48 - 21/12/2017 14:59:24 
-## Species 42 of 55 - Species 49 - 21/12/2017 14:59:24 
-## Species 43 of 55 - Species 5 - 21/12/2017 14:59:24 
-## Species 44 of 55 - Species 50 - 21/12/2017 14:59:24 
-## Species 45 of 55 - Species 51 - 21/12/2017 14:59:24 
-## Species 46 of 55 - Species 52 - 21/12/2017 14:59:24 
-## Species 47 of 55 - Species 54 - 21/12/2017 14:59:24 
-## Species 48 of 55 - Species 55 - 21/12/2017 14:59:24 
-## Species 49 of 55 - Species 56 - 21/12/2017 14:59:24 
-## Species 50 of 55 - Species 57 - 21/12/2017 14:59:24 
-## Species 51 of 55 - Species 6 - 21/12/2017 14:59:24 
-## Species 52 of 55 - Species 66 - 21/12/2017 14:59:24 
-## Species 53 of 55 - Species 7 - 21/12/2017 14:59:24 
-## Species 54 of 55 - Species 8 - 21/12/2017 14:59:24 
-## Species 55 of 55 - Species 9 - 21/12/2017 14:59:24 
+## Species 1 of 55 - Species 1 - 10/07/2015 15:40:46 
+## Species 2 of 55 - Species 10 - 10/07/2015 15:40:46 
+## Species 3 of 55 - Species 11 - 10/07/2015 15:40:46 
+## Species 4 of 55 - Species 12 - 10/07/2015 15:40:46 
+## Species 5 of 55 - Species 13 - 10/07/2015 15:40:46 
+## Species 6 of 55 - Species 14 - 10/07/2015 15:40:46 
+## Species 7 of 55 - Species 15 - 10/07/2015 15:40:46 
+## Species 8 of 55 - Species 16 - 10/07/2015 15:40:46 
+## Species 9 of 55 - Species 17 - 10/07/2015 15:40:46 
+## Species 10 of 55 - Species 18 - 10/07/2015 15:40:46 
+## Species 11 of 55 - Species 19 - 10/07/2015 15:40:46 
+## Species 12 of 55 - Species 2 - 10/07/2015 15:40:46 
+## Species 13 of 55 - Species 20 - 10/07/2015 15:40:46 
+## Species 14 of 55 - Species 21 - 10/07/2015 15:40:46 
+## Species 15 of 55 - Species 22 - 10/07/2015 15:40:46 
+## Species 16 of 55 - Species 23 - 10/07/2015 15:40:46 
+## Species 17 of 55 - Species 24 - 10/07/2015 15:40:46 
+## Species 18 of 55 - Species 25 - 10/07/2015 15:40:46 
+## Species 19 of 55 - Species 27 - 10/07/2015 15:40:46 
+## Species 20 of 55 - Species 28 - 10/07/2015 15:40:46 
+## Species 21 of 55 - Species 29 - 10/07/2015 15:40:46 
+## Species 22 of 55 - Species 3 - 10/07/2015 15:40:46 
+## Species 23 of 55 - Species 30 - 10/07/2015 15:40:46 
+## Species 24 of 55 - Species 31 - 10/07/2015 15:40:46 
+## Species 25 of 55 - Species 32 - 10/07/2015 15:40:46 
+## Species 26 of 55 - Species 33 - 10/07/2015 15:40:46 
+## Species 27 of 55 - Species 34 - 10/07/2015 15:40:46 
+## Species 28 of 55 - Species 35 - 10/07/2015 15:40:46 
+## Species 29 of 55 - Species 36 - 10/07/2015 15:40:46 
+## Species 30 of 55 - Species 37 - 10/07/2015 15:40:46 
+## Species 31 of 55 - Species 38 - 10/07/2015 15:40:46 
+## Species 32 of 55 - Species 39 - 10/07/2015 15:40:46 
+## Species 33 of 55 - Species 4 - 10/07/2015 15:40:46 
+## Species 34 of 55 - Species 40 - 10/07/2015 15:40:46 
+## Species 35 of 55 - Species 41 - 10/07/2015 15:40:46 
+## Species 36 of 55 - Species 42 - 10/07/2015 15:40:46 
+## Species 37 of 55 - Species 44 - 10/07/2015 15:40:46 
+## Species 38 of 55 - Species 45 - 10/07/2015 15:40:46 
+## Species 39 of 55 - Species 46 - 10/07/2015 15:40:46 
+## Species 40 of 55 - Species 47 - 10/07/2015 15:40:46 
+## Species 41 of 55 - Species 48 - 10/07/2015 15:40:46 
+## Species 42 of 55 - Species 49 - 10/07/2015 15:40:46 
+## Species 43 of 55 - Species 5 - 10/07/2015 15:40:46 
+## Species 44 of 55 - Species 50 - 10/07/2015 15:40:46 
+## Species 45 of 55 - Species 51 - 10/07/2015 15:40:46 
+## Species 46 of 55 - Species 52 - 10/07/2015 15:40:46 
+## Species 47 of 55 - Species 54 - 10/07/2015 15:40:46 
+## Species 48 of 55 - Species 55 - 10/07/2015 15:40:46 
+## Species 49 of 55 - Species 56 - 10/07/2015 15:40:46 
+## Species 50 of 55 - Species 57 - 10/07/2015 15:40:46 
+## Species 51 of 55 - Species 6 - 10/07/2015 15:40:46 
+## Species 52 of 55 - Species 66 - 10/07/2015 15:40:46 
+## Species 53 of 55 - Species 7 - 10/07/2015 15:40:46 
+## Species 54 of 55 - Species 8 - 10/07/2015 15:40:46 
+## Species 55 of 55 - Species 9 - 10/07/2015 15:40:46 
## [1] "frescalo complete"

It is worth noting the console output here. We get a warning telling us that there are data from a site that is not in the weights file, so we might want to investigate that and add the site to the weights file; we will ignore it for now, but the sketch below shows one way to find such sites. The second warning tells us that the sinkdir we gave already contains frescalo output; the function has got around this by renaming the output. Finally, we get a long list of all the species as their data are compiled internally.
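A sketch of how to chase up that first warning, assuming the analysis used the myWeights table built earlier (the hectad and target column names follow the previews shown above):

```r
# Sites present in the unicorns data but missing from the weights file
setdiff(unique(unicorns$hectad), unique(myWeights$target))
```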

Now for the plotting.

-
plot(unicorn_results)
-

-

Each panel of the plot gives different information for your results. The top left plot shows the observed number of species at each site (given in unicorn_results$stat$No_spp), this can be contrasted with the top right plot which gives the estimated number of species after accounting for recording effort (given in unicorn_results$stat$Spnum_out). Recording effort is presented in the bottom left panel - low values of alpha (white) show areas of high recording effort (given in unicorn_results$stat$Alpha), and a summary of the species trends are given in the bottom right (given in unicorn_results$lm_stats). In this case there is a skew towards species increasing, however some of these may be non-significant, this could be explored in more detail be referring to unicorn_results$lm_stats.

+
plot(unicorn_results)
+
+
+
+

Each panel of the plot gives different information for your results. The top right plot shows the observed number of species at each site (given in unicorn_results$stat$No_spp); this can be contrasted with the top left plot, which gives the estimated number of species after accounting for recording effort (given in unicorn_results$stat$Spnum_out). Recording effort is presented in the bottom left panel, where low values of alpha (white) show areas of high recording effort (given in unicorn_results$stat$Alpha), and a summary of the species trends is given in the bottom right (given in unicorn_results$lm_stats). In this case there is a skew towards species increasing; however, some of these trends may be non-significant, which could be explored in more detail by referring to unicorn_results$lm_stats, as shown below.
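For example, to pull out only the species whose trends pass the z-test at the 95% level, the SIG_95 column we saw earlier can be used directly:

```r
# Subset the trend statistics to significant species only
sig <- unicorn_results$lm_stats[unicorn_results$lm_stats$SIG_95, ]
sig[, c('NAME', 'Z_VAL', 'SIG_95')]
```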

@@ -1735,15 +1294,16 @@

-

Site built with pkgdown.

+

Site built with pkgdown 1.3.0.

-
+ + diff --git a/docs/articles/sparta_vignette.md b/docs/articles/sparta_vignette.md deleted file mode 100644 index e57793a..0000000 --- a/docs/articles/sparta_vignette.md +++ /dev/null @@ -1,1754 +0,0 @@ -# sparta - Species Presence Absence R Trends Analyses -Tom August -June 2015 - - - -# Introduction - -Sparta provides a range of tools for analysing trends in species occurrence data and is based on the work presented in [Isaac et al (2014)](http://onlinelibrary.wiley.com/doi/10.1111/2041-210X.12254/abstract). The data that is used in these method is 'what where and when'. The 'what' is typically a species name. 'Where' is the location of the observation, sometimes referred to as the site. This is typically a 1km, 2km or 10km grid square but could also be a none regular location such as field sites or counties. 'When' is the time when an observation is made, and the requirements differ between methods. Some methods require a date while others require you to aggregate dates into time periods for comparison. - -All of the methods described here require multi species data. This is because they use information across all species to assess biases. - -In this vignette we will run through the methods and show how they can be used in reproducible examples. - -## Installation - -Installing the package is easy and can be done from CRAN. Alternatively the development version can be installed from GitHub. - -NOTE: JAGS must be installed before the R package installation will work. JAGS can be found here - http://sourceforge.net/projects/mcmc-jags/files/JAGS/ - - -```r -# Install the package from CRAN -# THIS WILL WORK ONLY AFTER THE PACKAGE IS PUBLISHED -install.packages('sparta') - -# Or install the development version from GitHub -library(devtools) -install_github('biologicalrecordscentre/sparta') -``` - - -```r -# Once installed, load the package -library(sparta) -``` - -``` -## Loading required package: lme4 -## Loading required package: Matrix -## Loading required package: Rcpp -``` - -The functions in sparta cover a range of tasks. Primarily they are focused on analysing trends in species occurrence data while accounting for biases (see Isaac et al, 2014). In this vignette we step through these functions and others so that you can understand how the package works. If you have any questions you can find the package maintainers email address using `maintainer('sparta')`, and if you have issues or bugs you can [report them here](https://github.com/biologicalrecordscentre/sparta/issues) - -\pagebreak - -# Modelling methods - -## Create some example data - -Clearly when you are using sparta you will want to use your own data, however perhaps you are only at the planning stage of your project? This code shows you how to create some example data so that you can try out sparta's functionality. 
- - -```r -# Create data -n <- 8000 # size of dataset -nyr <- 50 # number of years in data -nSamples <- 200 # set number of dates -nSites <- 100 # set number of sites -set.seed(125) # set a random seed - -# Create somes dates -first <- as.Date(strptime("1950/01/01", "%Y/%m/%d")) -last <- as.Date(strptime(paste(1950+(nyr-1),"/12/31", sep=''), "%Y/%m/%d")) -dt <- last-first -rDates <- first + (runif(nSamples)*dt) - -# taxa are set semi-randomly -taxa_probabilities <- seq(from = 0.1, to = 0.7, length.out = 26) -taxa <- sample(letters, size = n, TRUE, prob = taxa_probabilities) - -# sites are visited semi-randomly -site_probabilities <- seq(from = 0.1, to = 0.7, length.out = nSites) -site <- sample(paste('A', 1:nSites, sep=''), size = n, TRUE, prob = site_probabilities) - -# the date of visit is selected semi-randomly from those created earlier -time_probabilities <- seq(from = 0.1, to = 0.7, length.out = nSamples) -time_period <- sample(rDates, size = n, TRUE, prob = time_probabilities) - -myData <- data.frame(taxa, site, time_period) - -# Let's have a look at the my example data -head(myData) -``` - -``` -## taxa site time_period -## 1 r A51 1970-01-14 -## 2 v A87 1980-09-29 -## 3 e A56 1996-04-14 -## 4 z A28 1959-01-16 -## 5 r A77 1970-09-21 -## 6 x A48 1990-02-25 -``` - -In general this is the format of data you will need for all of the functions in sparta. The taxa and site columns should be characters and the time_period column should ideally be a date but can in some cases be a numeric. - -There are many sources of wildlife observation data including GBIF (Global Biodiversity Information Facility) and the NBN gateway (National Biodiversity Network). Both of these repositories have R packages that will allow you to download this type of data straight into your R session (see [rgbif](http://cran.r-project.org/web/packages/rgbif/index.html) and [rnbn](http://cran.r-project.org/web/packages/rnbn/index.html) for details) - -## Assessing the quality of data - -It can be useful to have a look at your data before you do any analyses. For example it is important to understand the biases in your data. The function `dataDiagnostics` is designed to help with this. - - -```r -# Run some data diagnostics on our data -results <- dataDiagnostics(taxa = myData$taxa, - site = myData$site, - time_period = myData$time_period, - progress_bar = FALSE) -``` - -``` -## Warning in errorChecks(taxa = taxa, site = site, time_period = -## time_period): 94 out of 8000 observations will be removed as duplicates -``` - -![](unnamed-chunk-4-1.png) - -``` -## ## Linear model outputs ## -## -## There is no detectable change in the number of records over time: -## -## Estimate Std. Error t value Pr(>|t|) -## (Intercept) -894.8997359 1710.0719088 -0.5233112 0.6031654 -## time_period 0.5342617 0.8660553 0.6168910 0.5402219 -## -## -## There is no detectable change in list lengths over time: -## -## Estimate Std. Error z value Pr(>|z|) -## (Intercept) 2.390402e-01 1.208657e-02 19.7773477 4.665954e-87 -## time_period 1.098369e-06 2.135956e-06 0.5142282 6.070924e-01 -``` - -The plot produced shows the number of records for each year in the top plot and the average list length in a box plot at the bottom. List length is the number of taxa observed on a visit to a site, where a visit is taken to be a unique combination of 'where' and 'when'. A trend in the number of observations across time is not uncommon and a formal test for such a trend is performed in the form of a linear model. 
Trends in the number of records over time are handled by all of the methods presented in sparta in a variety of different ways. Trends in list length are tested in the same manner, and both are returned to the console. A in list length can cause some methods such as the reporting rate methods to fail (see 'LessEffortPerVisit' scenario in [Isaac et al (2014)](http://onlinelibrary.wiley.com/doi/10.1111/2041-210X.12254/abstract)) -Unsurprisingly, since this is a random dataset, we have no trend in either the number of records or list length over time. This function also works if we have a numeric for time period such as the year - - -```r -# Run some data diagnostics on our data, now time_period -# is set to be a year -results <- dataDiagnostics(taxa = myData$taxa, - site = myData$site, - time_period = as.numeric(format(myData$time_period, '%Y')), - progress_bar = FALSE) -``` - -``` -## Warning in errorChecks(taxa = taxa, site = site, time_period = -## time_period): 419 out of 8000 observations will be removed as duplicates -``` - -![](unnamed-chunk-5-1.png) - -``` -## ## Linear model outputs ## -## -## There is no detectable change in the number of records over time: -## -## Estimate Std. Error t value Pr(>|t|) -## (Intercept) -894.8997359 1710.0719088 -0.5233112 0.6031654 -## time_period 0.5342617 0.8660553 0.6168910 0.5402219 -## -## -## There is no detectable change in list lengths over time: -## -## Estimate Std. Error z value Pr(>|z|) -## (Intercept) -0.6465523185 1.5554513917 -0.4156686 0.6776525 -## time_period 0.0007201245 0.0007874907 0.9144546 0.3604780 -``` - -If we want to view these results in more detail we can interrogate the object `results` - - -```r -# See what is in results.. -names(results) -``` - -``` -## [1] "RecordsPerYear" "VisitListLength" "modelRecs" "modelList" -``` - -```r -# Let's have a look at the details -head(results$RecordsPerYear) -``` - -``` -## RecordsPerYear -## 1950 1951 1952 1953 1954 1955 -## 224 69 147 181 119 218 -``` - -```r -head(results$VisitListLength) -``` - -``` -## time_period site listLength -## 1 1950 A100 3 -## 2 1950 A11 1 -## 3 1950 A12 2 -## 4 1950 A13 1 -## 5 1950 A15 1 -## 6 1950 A16 2 -``` - -```r -summary(results$modelRecs) -``` - -``` -## -## Call: -## glm(formula = count ~ time_period, data = mData) -## -## Deviance Residuals: -## Min 1Q Median 3Q Max -## -136.06 -59.03 -22.40 50.51 265.99 -## -## Coefficients: -## Estimate Std. Error t value Pr(>|t|) -## (Intercept) -894.8997 1710.0719 -0.523 0.603 -## time_period 0.5343 0.8661 0.617 0.540 -## -## (Dispersion parameter for gaussian family taken to be 7809.915) -## -## Null deviance: 377848 on 49 degrees of freedom -## Residual deviance: 374876 on 48 degrees of freedom -## AIC: 594.01 -## -## Number of Fisher Scoring iterations: 2 -``` - -```r -summary(results$modelList) -``` - -``` -## -## Call: -## glm(formula = listLength ~ time_period, family = "poisson", data = space_time) -## -## Deviance Residuals: -## Min 1Q Median 3Q Max -## -0.9132 -0.8866 -0.1309 0.5260 3.8475 -## -## Coefficients: -## Estimate Std. 
Error z value Pr(>|z|) -## (Intercept) -0.6465523 1.5554514 -0.416 0.678 -## time_period 0.0007201 0.0007875 0.914 0.360 -## -## (Dispersion parameter for poisson family taken to be 1) -## -## Null deviance: 2737.1 on 3489 degrees of freedom -## Residual deviance: 2736.3 on 3488 degrees of freedom -## AIC: 11607 -## -## Number of Fisher Scoring iterations: 5 -``` - - -## Telfer - -Telfer's change index is designed to assess the relative change in range size of species between two time periods ([Telfer et al, 2002](http://www.sciencedirect.com/science/article/pii/S0006320702000502#)). This is a simple method that is robust but has low power to detect trends where they exist. While this method is designed to compare two time period sparta can take many time periods and will complete all pairwise comparisons. - -Our data is not quite in the correct format for Telfer since it is used to compare time periods but our `time_period` column is a date. We can fix this by using the `date2timeperiod` function. - - -```r -## Create a new column for the time period -# First define my time periods -time_periods <- data.frame(start = c(1950, 1960, 1970, 1980, 1990), - end = c(1959, 1969, 1979, 1989, 1999)) - -time_periods -``` - -``` -## start end -## 1 1950 1959 -## 2 1960 1969 -## 3 1970 1979 -## 4 1980 1989 -## 5 1990 1999 -``` - -```r -# Now use these to assign my dates to time periods -myData$tp <- date2timeperiod(myData$time_period, time_periods) - -head(myData) -``` - -``` -## taxa site time_period tp -## 1 r A51 1970-01-14 3 -## 2 v A87 1980-09-29 4 -## 3 e A56 1996-04-14 5 -## 4 z A28 1959-01-16 1 -## 5 r A77 1970-09-21 3 -## 6 x A48 1990-02-25 5 -``` - -As you can see our new column indicates which time period each date falls into with 1 being the earliest time period, 2 being the second and so on. This function will also work if instead of a single date for each record you have a date range - - -```r -## Create a dataset where we have date ranges -Date_range <- data.frame(startdate = myData$time_period, - enddate = (myData$time_period + 600)) - -head(Date_range) -``` - -``` -## startdate enddate -## 1 1970-01-14 1971-09-06 -## 2 1980-09-29 1982-05-22 -## 3 1996-04-14 1997-12-05 -## 4 1959-01-16 1960-09-07 -## 5 1970-09-21 1972-05-13 -## 6 1990-02-25 1991-10-18 -``` - -```r -# Now assign my date ranges to time periods -Date_range$time_period <- date2timeperiod(Date_range, time_periods) - -head(Date_range) -``` - -``` -## startdate enddate time_period -## 1 1970-01-14 1971-09-06 3 -## 2 1980-09-29 1982-05-22 4 -## 3 1996-04-14 1997-12-05 5 -## 4 1959-01-16 1960-09-07 NA -## 5 1970-09-21 1972-05-13 3 -## 6 1990-02-25 1991-10-18 5 -``` - -As you can see in this example when a date range spans the boundaries of your time periods NA is returned. - -Now we have our data in the right format we can use the `telfer` function to analyse the data. The Telfer index for each species is the standardized residual from a linear regression across all species and is a measure of relative change only as the average real trend across species is obscured ([Isaac et al (2014)](http://onlinelibrary.wiley.com/doi/10.1111/2041-210X.12254/abstract); [Telfer et al, 2002](http://www.sciencedirect.com/science/article/pii/S0006320702000502#)).Telfer is used for comparing two time periods and if you have more than this the `telfer` function will all pair-wise comparisons. 
- - -```r -# Here is our data -head(myData) -``` - -``` -## taxa site time_period tp -## 1 r A51 1970-01-14 3 -## 2 v A87 1980-09-29 4 -## 3 e A56 1996-04-14 5 -## 4 z A28 1959-01-16 1 -## 5 r A77 1970-09-21 3 -## 6 x A48 1990-02-25 5 -``` - -```r -telfer_results <- telfer(taxa = myData$taxa, - site = myData$site, - time_period = myData$tp, - minSite = 2) -``` - -``` -## Warning in errorChecks(taxa = taxa, site = site, time_period = -## time_period, : 2541 out of 8000 observations will be removed as duplicates -``` - -We get a warning message indicating that a large number of rows are being removed as duplicates. This occurs since we are now aggregating records into time periods and therefore creating a large number of duplicates. - -The results give the change index for each species (rows) in each of the pairwise comparisons of time periods (columns). - - -```r -head(telfer_results) -``` - -``` -## taxa Telfer_1_2 Telfer_1_3 Telfer_1_4 Telfer_1_5 Telfer_2_3 -## 1 a -0.67842545 -1.744577671 -1.8073843 -0.7000801 -1.8352888 -## 2 b -0.90368128 -0.841219630 -0.8697828 -1.5449132 -0.5139840 -## 3 c 0.96096754 -0.008737329 0.2181534 0.3726534 -0.7254485 -## 4 d 0.79744179 -0.558165922 0.3848417 1.6642357 -1.1759409 -## 5 e -0.01856808 0.490523483 -1.0901348 -1.6500473 0.3450083 -## 6 f -0.80201507 -0.412461197 -1.0846426 0.3817399 0.1657078 -## Telfer_2_4 Telfer_2_5 Telfer_3_4 Telfer_3_5 Telfer_4_5 -## 1 -2.1097232 -0.4557972 -1.1728237 0.8437536 1.4880569 -## 2 -0.6234749 -0.8326960 -0.3171487 -1.1756988 -0.8995878 -## 3 -0.3891040 -0.3595835 0.3549603 -0.2184517 -0.3834038 -## 4 -0.1875890 0.5294236 1.2663488 1.3562488 0.6466352 -## 5 -1.1254544 -1.7153826 -1.8881411 -2.1972910 -1.0810351 -## 6 -0.5122655 0.8827473 -0.8383498 0.4662370 1.3111555 -``` - -## Reporting Rate Models - -The reporting rates models in sparta are all either GLMs or GLMMs with year as a continuous covariate but are flexible, giving the user a number of options for their analyses. These options include the addition of covariates to account for biases in the data including a random site effect and fixed effect of list length. - -In [Isaac et al (2014)](http://onlinelibrary.wiley.com/doi/10.1111/2041-210X.12254/abstract) it was shown that reporting rate models can be susceptible to type 1 errors under certain scenarios and that with site and list length covariates the models performed better when the data were bias. These methods were found to out perform simple methods like Telfer. - -The common feature among these models is that the quantity under consideration is the 'probability of being recorded'. When binomial models are used (as is the default), it's the 'probability for an average visit' for the Bernoulli version it is the probability of being recorded per time period. - -### Data selection - -Before undertaking modelling the data can be subset in an effort to remove data that may introduce bias. Model sub-setting was found to reduce power in [Isaac et al (2014)](http://onlinelibrary.wiley.com/doi/10.1111/2041-210X.12254/abstract) but can partially deal with uneven sampling of site. This process can also be used with other methods and is not solely applicable to the reporting rate models. - -The first function allows you to subset your data by list length. This works out, for each combination of 'where' and 'when' (a visit), the number of species observed (list length). Any records that to not come from a list that meets your list length criteria are then dropped. 
- - -```r -# Select only records which occur on lists of length 2 or more -myDataL <- siteSelectionMinL(taxa = myData$taxa, - site = myData$site, - time_period = myData$time_period, - minL = 2) -``` - -``` -## Warning in errorChecks(taxa, site, time_period): 94 out of 8000 -## observations will be removed as duplicates -``` - -```r -head(myDataL) -``` - -``` -## taxa site time_period -## 1 u A1 1952-11-16 -## 2 n A1 1952-11-16 -## 3 x A1 1960-06-06 -## 4 s A1 1960-06-06 -## 5 x A1 1999-08-03 -## 6 d A1 1999-08-03 -``` - -```r -# We now have a much smaller dataset after subsetting -nrow(myData) -``` - -``` -## [1] 8000 -``` - -```r -nrow(myDataL) -``` - -``` -## [1] 3082 -``` - -We are also able to subset by the number of times a site is sampled. The function `siteSelectionMinTP` does this. When time_period is a date, as in this case, minTP is minimum number of years a site must be sampled in for it be included in the subset. - - -```r -# Select only data from sites sampled in at least 10 years -myDataTP <- siteSelectionMinTP(taxa = myData$taxa, - site = myData$site, - time_period = myData$time_period, - minTP = 10) -``` - -``` -## Warning in errorChecks(taxa, site, time_period): 94 out of 8000 -## observations will be removed as duplicates -``` - -```r -head(myDataTP) -``` - -``` -## taxa site time_period -## 1 r A51 1970-01-14 -## 2 v A87 1980-09-29 -## 3 e A56 1996-04-14 -## 4 z A28 1959-01-16 -## 5 r A77 1970-09-21 -## 6 x A48 1990-02-25 -``` - -```r -# Here we have only lost a small number rows, this is because -# many sites in our data are visited in a lot of years. Those -# rows that have been removed are duplicates -nrow(myData) -``` - -``` -## [1] 8000 -``` - -```r -nrow(myDataTP) -``` - -``` -## [1] 7906 -``` - -As you can see in the above example minTP specifies the number of years a site must be sampled in order to be included. However, our dataset is very well sampled so we might be interested in another measure of time. For example, you might want only sites that have been observed in at least 60 months. Let's see how this could be done. - - -```r -# We need to create a new column to represent unique months -# this could also be any unit of time you wanted (week, decade, etc.) 
- -# This line returns a unique character for each month -unique_Months <- format(myData$time_period, "%B_%Y") -head(unique_Months) -``` - -``` -## [1] "January_1970" "September_1980" "April_1996" "January_1959" -## [5] "September_1970" "February_1990" -``` - -```r -# Week could be done like this, see ?strptime for more details -unique_Weeks <- format(myData$time_period, "%U_%Y") -head(unique_Weeks) -``` - -``` -## [1] "02_1970" "39_1980" "15_1996" "02_1959" "38_1970" "08_1990" -``` - -```r -# Now lets subset to records found on 60 months or more -myData60Months <- siteSelectionMinTP(taxa = myData$taxa, - site = myData$site, - time_period = unique_Months, - minTP = 60) -``` - -``` -## Warning in errorChecks(taxa, site, time_period): 129 out of 8000 -## observations will be removed as duplicates -``` - -```r -head(myData60Months) -``` - -``` -## taxa site time_period -## 1 r A51 January_1970 -## 2 v A87 September_1980 -## 3 e A56 April_1996 -## 5 r A77 September_1970 -## 6 x A48 February_1990 -## 7 t A59 January_1981 -``` - -```r -# We could merge this back with our original data if -# we need to retain the full dates -myData60Months <- merge(myData60Months, myData$time_period, - all.x = TRUE, all.y = FALSE, - by = "row.names") -head(myData60Months) -``` - -``` -## Row.names taxa site time_period y -## 1 1 r A51 January_1970 1970-01-14 -## 2 10 w A81 June_1982 1982-06-19 -## 3 100 v A91 January_1996 1996-01-29 -## 4 1000 h A94 May_1990 1981-01-17 -## 5 1001 m A73 March_1999 1990-05-18 -## 6 1002 b A59 July_1997 1999-03-05 -``` - -```r -nrow(myData) -``` - -``` -## [1] 8000 -``` - -```r -nrow(myData60Months) -``` - -``` -## [1] 5289 -``` - -Following the method in Roy et al (2012) we can combine these two functions to subset both by the length of lists and by the number of years that sites are sampled. This has been wrapped up in to the function `siteSelection` which takes all the arguments of the previous two functions plus the argument `LFirst` which indicates whether the data should be subset by list length first (`TRUE`) or second (`FALSE`). - - -```r -# Subset our data as above but in one go -myDataSubset <- siteSelection(taxa = myData$taxa, - site = myData$site, - time_period = myData$time_period, - minL = 2, - minTP = 10, - LFirst = TRUE) -``` - -``` -## Warning in errorChecks(taxa, site, time_period): 94 out of 8000 -## observations will be removed as duplicates -``` - -```r -head(myDataSubset) -``` - -``` -## taxa site time_period -## 11 y A100 1950-01-04 -## 12 k A100 1950-01-04 -## 13 l A100 1954-01-30 -## 14 o A100 1954-01-30 -## 15 s A100 1954-01-30 -## 16 m A100 1956-02-02 -``` - -```r -nrow(myDataSubset) -``` - -``` -## [1] 2587 -``` - -### Running Reporting Rate Models - -Once you have subset your data using the above functions (or perhaps not at all) the reporting rate models can be applied using the function `reportingRateModel`. This function offers flexibility in the model you wish to fit, allowing the user to specify whether list length and site should be used as covariates, whether over-dispersion should be used, and whether the family should be binomial or Bernoulli. A number of these variants are presented in [Isaac et al (2014)](http://onlinelibrary.wiley.com/doi/10.1111/2041-210X.12254/abstract). While multi-species data is required it is not nessecary to model all species. In fact you can save a significant amount of time by only modelling hte species you are interested in. 
- - -```r -# Run the reporting rate model using list length as a fixed effect and -# site as a random effect. Here we only model a few species. -system.time({ -RR_out <- reportingRateModel(taxa = myData$taxa, - site = myData$site, - time_period = myData$time_period, - list_length = TRUE, - site_effect = TRUE, - species_to_include = c('e','u','r','o','t','a','s'), - overdispersion = FALSE, - family = 'Bernoulli', - print_progress = TRUE) -}) -``` - -``` -## Warning in errorChecks(taxa = taxa, site = site, time_period = -## time_period, : 94 out of 8000 observations will be removed as duplicates -``` - -``` -## Modelling e - Species 1 of 7 -## Modelling u - Species 2 of 7 -## Modelling r - Species 3 of 7 -## Modelling o - Species 4 of 7 -## Modelling t - Species 5 of 7 -## Modelling a - Species 6 of 7 -## Modelling s - Species 7 of 7 -``` - -``` -## user system elapsed -## 11.44 0.00 11.46 -``` - -```r -# Let's have a look at the data that is returned -str(RR_out) -``` - -``` -## 'data.frame': 7 obs. of 14 variables: -## $ species_name : Factor w/ 7 levels "e","u","r","o",..: 1 2 3 4 5 6 7 -## $ intercept.estimate : num -4.53 -3.52 -3.32 -3.63 -3.68 ... -## $ year.estimate : num -0.005811 -0.006944 -0.003321 0.000264 -0.004033 ... -## $ listlength.estimate: num 0.574 0.702 0.472 0.572 0.717 ... -## $ intercept.stderror : num 0.185 0.113 0.123 0.129 0.116 ... -## $ year.stderror : num 0.00583 0.00342 0.00359 0.00384 0.00361 ... -## $ listlength.stderror: num 0.1092 0.0659 0.0754 0.0759 0.0683 ... -## $ intercept.zvalue : num -24.5 -31.3 -27.1 -28.2 -31.6 ... -## $ year.zvalue : num -0.9961 -2.0324 -0.9244 0.0688 -1.1177 ... -## $ listlength.zvalue : num 5.25 10.65 6.26 7.54 10.49 ... -## $ intercept.pvalue : num 2.34e-132 1.58e-214 6.06e-162 1.07e-174 3.78e-219 ... -## $ year.pvalue : num 0.3192 0.0421 0.3553 0.9452 0.2637 ... -## $ listlength.pvalue : num 1.51e-07 1.68e-26 3.76e-10 4.78e-14 9.57e-26 ... -## $ observations : num 144 450 398 346 398 73 426 -## - attr(*, "intercept_year")= num 1974 -## - attr(*, "min_year")= num -24.5 -## - attr(*, "max_year")= num 24.5 -## - attr(*, "nVisits")= int 6211 -## - attr(*, "model_formula")= chr "taxa ~ year + listLength + (1|site)" -``` - -```r -# We could plot these to see the species trends -with(RR_out, - # Plot graph - {plot(x = 1:7, y = year.estimate, - ylim = range(c(year.estimate - year.stderror, - year.estimate + year.stderror)), - ylab = 'Year effect (+/- Std Dev)', - xlab = 'Species', - xaxt = "n") - # Add x-axis with species names - axis(1, at = 1:7, labels = species_name) - # Add the error bars - arrows(1:7, year.estimate - year.stderror, - 1:7, year.estimate + year.stderror, - length = 0.05, angle = 90, code = 3)} - ) -``` - -![](unnamed-chunk-15-1.png) - -The returned object is a data frame with one row per species. Each column gives information on an element of the model output including covariate estimates, standard errors and p-values. This object also has some attributes giving the year that was chosen as the intercept, the number of visits in the dataset and the model formula used. - -These models can take a long time to run when your data set is larg or you have a large number of species to model. To make this faster it is possible to parallelise this process across species which can significantly improve your run times. Here is an example of how we would parallelise the above example using hte R package snowfall. 
- - -```r -# Load in snowfall -library(snowfall) -``` - -``` -## Loading required package: snow -``` - -```r -# I have 4 cpus on my PC so I set cpus to 4 -# when I initialise the cluster -sfInit(parallel = TRUE, cpus = 4) -``` - -``` -## Warning in searchCommandline(parallel, cpus = cpus, type = type, -## socketHosts = socketHosts, : Unknown option on commandline: -## rmarkdown::render('W:/PYWELL_SHARED/Pywell Projects/BRC/Tom August/R -## Packages/Trend analyses/sparta/pre_vignette/sparta_vignette.Rmd', encoding -``` - -``` -## R Version: R version 3.2.0 (2015-04-16) -``` - -``` -## snowfall 1.84-6 initialized (using snow 0.3-13): parallel execution on 4 CPUs. -``` - -```r -# Export my data to the cluster -sfExport('myData') - -# I create a function that takes a species name and runs my models -RR_mod_function <- function(taxa_name){ - - library(sparta) - - RR_out <- reportingRateModel(species_to_include = taxa_name, - taxa = myData$taxa, - site = myData$site, - time_period = myData$time_period, - list_length = TRUE, - site_effect = TRUE, - overdispersion = FALSE, - family = 'Bernoulli', - print_progress = FALSE) -} - -# I then run this in parallel -system.time({ -para_out <- sfClusterApplyLB(c('e','u','r','o','t','a','s'), RR_mod_function) -}) -``` - -``` -## user system elapsed -## 0.00 0.00 7.21 -``` - -```r -# Name each element of this output by the species -RR_out_combined <- do.call(rbind, para_out) - -# Stop the cluster -sfStop() -``` - -``` -## -## Stopping cluster -``` - -```r -# You'll see the output is the same as when we did it serially but the -# time taken is shorter. Using a cluster computer with many more than -# 4 cores can greatly reduce run time. -str(RR_out_combined) -``` - -``` -## 'data.frame': 7 obs. of 14 variables: -## $ species_name : Factor w/ 7 levels "e","u","r","o",..: 1 2 3 4 5 6 7 -## $ intercept.estimate : num -4.53 -3.52 -3.32 -3.63 -3.68 ... -## $ year.estimate : num -0.005811 -0.006944 -0.003321 0.000264 -0.004033 ... -## $ listlength.estimate: num 0.574 0.702 0.472 0.572 0.717 ... -## $ intercept.stderror : num 0.185 0.113 0.123 0.129 0.116 ... -## $ year.stderror : num 0.00583 0.00342 0.00359 0.00384 0.00361 ... -## $ listlength.stderror: num 0.1092 0.0659 0.0754 0.0759 0.0683 ... -## $ intercept.zvalue : num -24.5 -31.3 -27.1 -28.2 -31.6 ... -## $ year.zvalue : num -0.9961 -2.0324 -0.9244 0.0688 -1.1177 ... -## $ listlength.zvalue : num 5.25 10.65 6.26 7.54 10.49 ... -## $ intercept.pvalue : num 2.34e-132 1.58e-214 6.06e-162 1.07e-174 3.78e-219 ... -## $ year.pvalue : num 0.3192 0.0421 0.3553 0.9452 0.2637 ... -## $ listlength.pvalue : num 1.51e-07 1.68e-26 3.76e-10 4.78e-14 9.57e-26 ... -## $ observations : num 144 450 398 346 398 73 426 -## - attr(*, "intercept_year")= num 1974 -## - attr(*, "min_year")= num -24.5 -## - attr(*, "max_year")= num 24.5 -## - attr(*, "nVisits")= int 6211 -## - attr(*, "model_formula")= chr "taxa ~ year + listLength + (1|site)" -``` - -Using these functions it is possible to recreate the 'Well-sampled sites' method that is presented in [Roy et al (2012)](http://onlinelibrary.wiley.com/doi/10.1111/j.1472-4642.2012.00883.x/abstract) and [Thomas et al (2015)](http://onlinelibrary.wiley.com/doi/10.1111/bij.12527/full). This is made available in the function `WSS` which is a simple wrapper around `siteSelection` and `reportingratemodel`. In this variant the data is subset by list length and the number of years each site was sampled before being run in a GLMM with site as a random effect. 
-
-
-```r
-# Run our data through the well-sampled sites function
-# This time we run all species
-WSS_out <- WSS(taxa = myData$taxa,
-               site = myData$site,
-               time_period = myData$time_period,
-               minL = 2,
-               minTP = 10,
-               print_progress = FALSE)
-```
-
-```
-## Warning in errorChecks(taxa, site, time_period): 94 out of 8000
-## observations will be removed as duplicates
-```
-
-```r
-# The data is returned in the same format as from reportingRateModel
-str(WSS_out)
-```
-
-```
-## 'data.frame': 26 obs. of 10 variables:
-## $ species_name      : Factor w/ 26 levels "r","v","e","z",..: 1 2 3 4 5 6 7 8 9 10 ...
-## $ intercept.estimate: num -2.29 -1.85 -3.17 -1.81 -1.75 ...
-## $ year.estimate     : num -0.00912 0.0012 0.00158 0.00143 -0.00247 ...
-## $ intercept.stderror: num 0.1021 0.0861 0.1875 0.0848 0.0829 ...
-## $ year.stderror     : num 0.00684 0.00574 0.00973 0.00565 0.00554 ...
-## $ intercept.zvalue  : num -22.4 -21.5 -16.9 -21.3 -21.1 ...
-## $ year.zvalue       : num -1.334 0.208 0.163 0.253 -0.446 ...
-## $ intercept.pvalue  : num 1.70e-111 1.66e-102 6.55e-64 6.87e-101 1.06e-98 ...
-## $ year.pvalue       : num 0.182 0.835 0.871 0.8 0.656 ...
-## $ observations      : num 106 157 50 163 171 148 125 155 61 104 ...
-## - attr(*, "intercept_year")= num 1974
-## - attr(*, "min_year")= num -24.5
-## - attr(*, "max_year")= num 24.5
-## - attr(*, "nVisits")= int 1155
-## - attr(*, "model_formula")= chr "cbind(successes, failures) ~ year + (1|site)"
-## - attr(*, "minL")= num 2
-## - attr(*, "minTP")= num 10
-```
-
-```r
-# We can plot these and see that we get different results from our
-# previous analysis, since this time the method includes subsetting
-with(WSS_out[1:10,],
-     # Plot graph
-     {plot(x = 1:10, y = year.estimate,
-           ylim = range(c(year.estimate - year.stderror,
-                          year.estimate + year.stderror)),
-           ylab = 'Year effect (+/- Std Dev)',
-           xlab = 'Species',
-           xaxt = "n")
-      # Add x-axis with species names
-      axis(1, at = 1:10, labels = species_name[1:10])
-      # Add the error bars
-      arrows(1:10, year.estimate - year.stderror,
-             1:10, year.estimate + year.stderror,
-             length = 0.05, angle = 90, code = 3)}
-     )
-```
-
-![](unnamed-chunk-17-1.png)
-
-## Occupancy models
-
-Occupancy models were found by [Isaac et al (2014)](http://onlinelibrary.wiley.com/doi/10.1111/2041-210X.12254/abstract) to be one of the best tools for analysing species occurrence data typical of citizen science projects, being both robust and powerful. This method models the occupancy process separately from the detection process. We will not go into the details of the model here, since there is a growing literature about occupancy models and how and when they should be used. Here we focus on how the occupancy model discussed in Isaac et al (2014) is implemented in `sparta`.
-
-This function works in a very similar fashion to the previous functions we have discussed. The data it takes is 'what, where, when', as in other functions; however, here we have the option to specify which species we wish to model. This feature has been added because occupancy models are computationally intensive. The parameters of the function give you control over the number of iterations, burnin, thinning, the number of chains and the seed, and for advanced users there is also the possibility to pass in your own BUGS script.
-
-
-```r
-# Here is our data
-str(myData)
-```
-
-```
-## 'data.frame': 8000 obs. of 4 variables:
-## $ taxa       : Factor w/ 26 levels "a","b","c","d",..: 18 22 5 26 18 24 20 24 17 23 ...
-## $ site       : Factor w/ 100 levels "A1","A10","A100",..: 48 87 53 22 76 44 56 66 92 81 ...
-## $ time_period: Date, format: "1970-01-14" "1980-09-29" ...
-## $ tp         : int 3 4 5 1 3 5 4 1 5 4 ...
-```
-
-```r
-# Run an occupancy model for four species
-# Here we use a very small number of iterations
-# to avoid a long run time
-system.time({
-occ_out <- occDetModel(taxa = myData$taxa,
-                       site = myData$site,
-                       time_period = myData$time_period,
-                       species_list = c('a','b','c','d'),
-                       write_results = FALSE,
-                       n_iterations = 200,
-                       burnin = 15,
-                       n_chains = 3,
-                       thinning = 3,
-                       seed = 123)
-})
-```
-
-```
-## Warning in errorChecks(taxa = taxa, site = site, time_period =
-## time_period): 94 out of 8000 observations will be removed as duplicates
-```
-
-```
-##
-## ###
-## Modeling a - 1 of 4 taxa
-```
-
-```
-## module glm loaded
-```
-
-```
-## Compiling model graph
-## Resolving undeclared variables
-## Allocating nodes
-## Graph Size: 64272
-##
-## Initializing model
-##
-##
-## ###
-## Modeling b - 2 of 4 taxa
-## Compiling model graph
-## Resolving undeclared variables
-## Allocating nodes
-## Graph Size: 64306
-##
-## Initializing model
-##
-##
-## ###
-## Modeling c - 3 of 4 taxa
-## Compiling model graph
-## Resolving undeclared variables
-## Allocating nodes
-## Graph Size: 64308
-##
-## Initializing model
-##
-##
-## ###
-## Modeling d - 4 of 4 taxa
-## Compiling model graph
-## Resolving undeclared variables
-## Allocating nodes
-## Graph Size: 64328
-##
-## Initializing model
-```
-
-```
-##    user  system elapsed
-##   70.20    0.08   70.53
-```
-
-```r
-# Let's look at the results
-## The object returned is a list with one element for each species
-names(occ_out)
-```
-
-```
-## [1] "a" "b" "c" "d"
-```
-
-```r
-# Each of these is an object of class 'occDet'
-class(occ_out$a)
-```
-
-```
-## [1] "occDet"
-```
-
-```r
-# Inside these elements is the information of interest
-names(occ_out$a)
-```
-
-```
-## [1] "model"              "BUGSoutput"         "parameters.to.save"
-## [4] "model.file"         "n.iter"             "DIC"
-## [7] "SPP_NAME"           "min_year"           "max_year"
-```
-
-```r
-# Of particular interest to many users will be the summary
-# data in the BUGSoutput
-head(occ_out$a$BUGSoutput$summary)
-```
-
-```
-##                    mean         sd        2.5%          25%         50%
-## LL.p          0.2691009  0.3102804  -0.3519707   0.04895419   0.2556833
-## deviance    649.7160858 51.1319847 544.5524433 614.12892478 666.5255290
-## fit         281.9522389 89.1997805 190.9883913 219.13112637 247.9393552
-## fit.new     283.9149866 90.5760094 185.6004878 217.89008520 254.7165849
-## mean_early    0.3701971  0.1718789   0.1274588   0.25082920   0.3516627
-## mean_late     0.4114516  0.1241131   0.1808044   0.35083037   0.4000000
-##                     75%       97.5%     Rhat n.eff
-## LL.p          0.4929832   0.8780177 0.993288   190
-## deviance    688.2839296 710.7614446 1.031683    64
-## fit         316.7603165 499.9895976 1.009975   120
-## fit.new     319.0321333 504.5251365 1.006007   160
-## mean_early    0.4283236   0.7837351 1.841981     5
-## mean_late     0.4591644   0.7623305 1.295740    12
-```
-
-```r
-# We have included a plotting feature for objects of class
-# occDet which provides a useful visualisation of the trend
-# in occupancy over time
-plot(occ_out$a)
-```
-
-![](unnamed-chunk-18-1.png)
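-Given how few iterations we used, it is worth checking convergence before interpreting the results. A minimal sketch using the `Rhat` column of the summary shown above (flagging parameters with Rhat above roughly 1.1 is a common rule of thumb, not a sparta function):
-
-```r
-# Flag parameters whose Rhat suggests the chains have not converged
-summ <- occ_out$a$BUGSoutput$summary
-summ[summ[, 'Rhat'] > 1.1, c('Rhat', 'n.eff')]
-```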
-Here we have run a small example, but in reality these models are usually run for many thousands of iterations, making the analysis of more than a handful of species impractical. For those with access to the necessary facilities it is possible to parallelise across species. To do this we use a pair of functions that are used internally by `occDetModel`. These are `formatOccData`, which is used to format our occurrence data into the format needed by JAGS, and `occDetFunc`, the function which undertakes the modelling.
-
-
-```r
-# First format our data
-formattedOccData <- formatOccData(taxa = myData$taxa,
-                                  site = myData$site,
-                                  time_period = myData$time_period)
-```
-
-```
-## Warning in errorChecks(taxa = taxa, site = site, time_period =
-## time_period): 94 out of 8000 observations will be removed as duplicates
-```
-
-```r
-# This is a list of two elements
-names(formattedOccData)
-```
-
-```
-## [1] "spp_vis"    "occDetdata"
-```
-
-`formatOccData` returns a list of length 2. The first element, 'spp_vis', is a data.frame with visit (a unique combination of site and time period) in the first column and taxa for all the following columns. Values in the taxa columns are either TRUE or FALSE, depending on whether the species was observed on that visit.
-
-
-```r
-# Let's have a look at spp_vis
-head(formattedOccData$spp_vis[,1:5])
-```
-
-```
-##            visit     a     b     c     d
-## 1 A1001950-01-04 FALSE FALSE FALSE FALSE
-## 2 A1001950-11-01  TRUE FALSE FALSE FALSE
-## 3 A1001951-08-25 FALSE FALSE FALSE FALSE
-## 4 A1001951-11-03 FALSE FALSE FALSE FALSE
-## 5 A1001952-02-07 FALSE FALSE FALSE FALSE
-## 6 A1001953-02-22 FALSE FALSE FALSE FALSE
-```
-
-The second element, 'occDetdata', is a data frame giving the site, list length (the number of species observed on a visit) and year for each visit.
-
-
-```r
-# Let's have a look at occDetdata
-head(formattedOccData$occDetdata)
-```
-
-```
-##            visit site L year
-## 1 A1001950-01-04 A100 2 1950
-## 3 A1001950-11-01 A100 1 1950
-## 4 A1001951-08-25 A100 1 1951
-## 5 A1001951-11-03 A100 1 1951
-## 6 A1001952-02-07 A100 1 1952
-## 7 A1001953-02-22 A100 1 1953
-```
-
-With our data in the correct format, this can now go into the modelling function.
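-For example, a single species can be run directly through `occDetFunc`; this sketch simply uses the same settings as the parallel example that follows:
-
-```r
-# Run one species through occDetFunc using the formatted data
-occ_a <- occDetFunc(taxa_name = 'a',
-                    n_iterations = 200,
-                    burnin = 15,
-                    occDetdata = formattedOccData$occDetdata,
-                    spp_vis = formattedOccData$spp_vis,
-                    write_results = FALSE,
-                    seed = 123)
-```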
-
-
-```r
-# To parallelise the occupancy modelling across species
-# we are again going to use the package snowfall
-library(snowfall)
-
-# I have 4 cpus on my PC so I set cpus to 4
-# when I initialise the cluster
-sfInit(parallel = TRUE, cpus = 4)
-```
-
-```
-## Warning in searchCommandline(parallel, cpus = cpus, type = type,
-## socketHosts = socketHosts, : Unknown option on commandline:
-## rmarkdown::render('W:/PYWELL_SHARED/Pywell Projects/BRC/Tom August/R
-## Packages/Trend analyses/sparta/pre_vignette/sparta_vignette.Rmd', encoding
-```
-
-```
-## snowfall 1.84-6 initialized (using snow 0.3-13): parallel execution on 4 CPUs.
-```
-
-```r
-# Export my data to the cluster
-sfExport('formattedOccData')
-
-# I create a function that takes a species name and runs my model
-occ_mod_function <- function(taxa_name){
-
-  library(sparta)
-
-  occ_out <- occDetFunc(taxa_name = taxa_name,
-                        n_iterations = 200,
-                        burnin = 15,
-                        occDetdata = formattedOccData$occDetdata,
-                        spp_vis = formattedOccData$spp_vis,
-                        write_results = FALSE,
-                        seed = 123)
-}
-
-# I then run this in parallel
-system.time({
-para_out <- sfClusterApplyLB(c('a','b','c','d'), occ_mod_function)
-})
-```
-
-```
-##    user  system elapsed
-##    0.02    0.01   25.95
-```
-
-```r
-# Name each element of this output by the species
-for(i in 1:length(para_out)) names(para_out)[i] <- para_out[[i]]$SPP_NAME
-
-# Stop the cluster
-sfStop()
-```
-
-```
-##
-## Stopping cluster
-```
-
-```r
-# This takes about half the time of the
-# serial version we ran earlier, and the resulting object
-# is near-identical (since we set the random seed to be the same
-# in each)
-head(para_out$a$BUGSoutput$summary)
-```
-
-```
-##                    mean         sd        2.5%          25%         50%
-## LL.p          0.2691009  0.3102804  -0.3519707   0.04895419   0.2556833
-## deviance    649.7160858 51.1319847 544.5524433 614.12892478 666.5255290
-## fit         281.9522389 89.1997805 190.9883913 219.13112637 247.9393552
-## fit.new     283.9149866 90.5760094 185.6004878 217.89008520 254.7165849
-## mean_early    0.3708781  0.1715081   0.1307932   0.25000000   0.3466667
-## mean_late     0.4106272  0.1219015   0.1887431   0.35000000   0.3966667
-##                     75%       97.5%     Rhat n.eff
-## LL.p          0.4929832   0.8780177 0.993288   190
-## deviance    688.2839296 710.7614446 1.031683    64
-## fit         316.7603165 499.9895976 1.009975   120
-## fit.new     319.0321333 504.5251365 1.006007   160
-## mean_early    0.4349904   0.7779150 1.853765     5
-## mean_late     0.4533333   0.7547538 1.302426    11
-```
-
-```r
-plot(para_out$a)
-```
-
-![](unnamed-chunk-22-1.png)
-
-This same approach can be used on cluster computers, which can have hundreds of processors, to dramatically reduce run times.
-
-## Frescalo
-
-The frescalo method is outlined in [Hill (2012)](http://onlinelibrary.wiley.com/doi/10.1111/j.2041-210X.2011.00146.x/suppinfo) and is a means to account for both spatial and temporal bias. This method was shown by [Isaac et al (2014)](http://onlinelibrary.wiley.com/doi/10.1111/2041-210X.12254/abstract) to be a good method for data that are aggregated into time periods, such as when comparing atlases. The frescalo method is run using a .exe; you will need to download this file by visiting this link - [https://github.com/BiologicalRecordsCentre/frescalo](https://github.com/BiologicalRecordsCentre/frescalo). Once you have downloaded the .exe, make a note of the directory you have placed it in; we will need that in a moment.
-
-Again we will assume that your data is in a 'what, where, when' format similar to that we used in the previous method:
-
-
-```r
-head(myData)
-```
-
-```
-##   taxa site time_period tp
-## 1    r  A51  1970-01-14  3
-## 2    v  A87  1980-09-29  4
-## 3    e  A56  1996-04-14  5
-## 4    z  A28  1959-01-16  1
-## 5    r  A77  1970-09-21  3
-## 6    x  A48  1990-02-25  5
-```
-
-Frescalo's requirements in terms of data structure and types are a little different from those we have seen in other functions. Firstly, the entire data.frame is passed in as an argument called `Data`, and the column names of your various elements (taxa, site, etc.) are given as other arguments. Secondly, frescalo requires that the 'when' component is either a column of year or two columns, one of 'start date' and one of 'end date'.
-Our data as presented above does not fit into this format, so first we must reformat it. In our situation the simplest thing to do is to add a column giving the year. Since frescalo aggregates across time periods (often decades or greater), this loss of temporal resolution is not an issue.
-
-
-```r
-# Add a year column
-myData$year <- as.numeric(format(myData$time_period, '%Y'))
-head(myData)
-```
-
-```
-##   taxa site time_period tp year
-## 1    r  A51  1970-01-14  3 1970
-## 2    v  A87  1980-09-29  4 1980
-## 3    e  A56  1996-04-14  5 1996
-## 4    z  A28  1959-01-16  1 1959
-## 5    r  A77  1970-09-21  3 1970
-## 6    x  A48  1990-02-25  5 1990
-```
-
-Now that we have our data in the correct format for frescalo, there is one other major component we need: a weights file. You can find out more about the weights file and what it is used for in the original paper [(Hill, 2012)](http://onlinelibrary.wiley.com/doi/10.1111/j.2041-210X.2011.00146.x/suppinfo). In short, the weights file outlines the similarity between sites in your dataset. This information is used to weight the analysis of each site accordingly. If you are undertaking this analysis in the UK at 10km square resolution, there are some built-in weights files you can use. Some of these weights files use the UK landcover map instead of floristic similarity (as used in [Hill (2012)](http://onlinelibrary.wiley.com/doi/10.1111/j.2041-210X.2011.00146.x/suppinfo)). You can find out more about these in the frescalo help file.
-
-For the sake of demonstration, let us assume that you do not have a weights file for your analysis, or that you want to create your own. To create a weights file you need two things: a measure of physical distance between your sites and a measure of similarity. In the original paper this similarity measure was floristic similarity, but it could also be habitat similarity or whatever is relevant for the taxa you are studying. In this example I have a table of distances and a table of land cover proportions at each site.
-
-
-
-
-```r
-# Here is the distance table
-head(myDistances)
-```
-
-```
-##     x   y     dist
-## 1 A51 A51    0.000
-## 2 A87 A51 4074.258
-## 3 A56 A51 6595.711
-## 4 A28 A51 1531.943
-## 5 A77 A51 5732.942
-## 6 A48 A51 2394.873
-```
-
-```r
-# Here is our habitat data
-head(myHabitatData)
-```
-
-```
-##   site grassland  woodland  heathland      urban freshwater
-## 1  A51 0.1169123 0.1084992 0.28376157 0.37312774 0.11769919
-## 2  A87 0.1781151 0.1307214 0.35258119 0.26223604 0.07634632
-## 3  A56 0.2359391 0.1263644 0.25898930 0.13490734 0.24379991
-## 4  A28 0.3100922 0.1373896 0.20870313 0.28659095 0.05722412
-## 5  A77 0.2034073 0.4897063 0.05368464 0.01677132 0.23643036
-## 6  A48 0.2397599 0.1046128 0.34250853 0.13055663 0.18256221
-```
-
-```r
-# With our distance and habitat tables in hand we can
-# use the createWeights function to build our weights file
-# I have changed the defaults of dist_sub and sim_sub since
-# we have a very small example dataset of only 50 sites
-myWeights <- createWeights(distances = myDistances,
-                           attributes = myHabitatData,
-                           dist_sub = 20,
-                           sim_sub = 10)
-```
-
-```
-## Creating similarity distance table...Complete
-## Creating weights file...
-## 0%
-## 10%
-## 20%
-## 30%
-## 40%
-## 50%
-## 60%
-## 70%
-## 80%
-## 90%
-## 100%
-## Complete
-```
-
-```r
-head(myWeights)
-```
-
-```
-##   target neighbour weight
-## 1    A51        A2 0.0311
-## 2    A51       A47 0.1150
-## 3    A51       A49 0.0012
-## 4    A51       A51 1.0000
-## 5    A51       A53 0.0160
-## 6    A51       A62 0.2687
-```
-
-
-The `createWeights` function follows the procedure outlined in [Hill (2012)](http://onlinelibrary.wiley.com/doi/10.1111/j.2041-210X.2011.00146.x/suppinfo) for creating weights, and more information can be found in the function's help file. With our data and weights file we are now ready to proceed with frescalo. As with other functions, frescalo can take a range of additional arguments, which you can see by entering `?frescalo` at the console; here we will do a minimal example.
-
-
-```r
-# First we need to enter the location where we placed the .exe
-# In my case I saved it to my documents folder
-myFrescaloPath <- 'C:/Users/tomaug/Documents/Frescalo_3a_windows.exe'
-
-# I then want to set up the time periods I want to analyse
-# Here I say I want to compare 1980-89 to 1990-99
-myTimePeriods <- data.frame(start = c(1980, 1990), end = c(1989, 1999))
-head(myTimePeriods)
-```
-
-```
-##   start  end
-## 1  1980 1989
-## 2  1990 1999
-```
-
-```r
-# I also need to specify where I want my results to be saved
-# I'm going to save it in a folder in my working directory
-myFolder <- '~/myFolder'
-
-# Simple run of frescalo
-frescalo_results <- frescalo(Data = myData,
-                             frespath = myFrescaloPath,
-                             time_periods = myTimePeriods,
-                             site_col = 'site',
-                             sp_col = 'taxa',
-                             year = 'year',
-                             Fres_weights = myWeights,
-                             sinkdir = myFolder)
-```
-
-```
-##
-## SAVING DATA TO FRESCALO WORKING DIRECTORY
-## ********************
-##
-##
-## RUNNING FRESCALO
-## ********************
-```
-
-```
-## Warning in run_fresc_file(fres_data = Data, output_dir = fresoutput,
-## frescalo_path = frespath, : Your value of phi (0.74) is smaller than the
-## 98.5 percentile of input phi (0.89). It is reccommended your phi be similar
-## to this value. For more information see Hill (2011) reference in frescalo
-## help file
-```
-
-```
-## Building Species List - Complete
-## Outputting Species Results
-## Species 1 of 26 - a - 10/07/2015 14:21:05
-## Species 2 of 26 - b - 10/07/2015 14:21:05
-## Species 3 of 26 - c - 10/07/2015 14:21:05
-## Species 4 of 26 - d - 10/07/2015 14:21:05
-## Species 5 of 26 - e - 10/07/2015 14:21:05
-## Species 6 of 26 - f - 10/07/2015 14:21:05
-## Species 7 of 26 - g - 10/07/2015 14:21:05
-## Species 8 of 26 - h - 10/07/2015 14:21:05
-## Species 9 of 26 - i - 10/07/2015 14:21:05
-## Species 10 of 26 - j - 10/07/2015 14:21:05
-## Species 11 of 26 - k - 10/07/2015 14:21:05
-## Species 12 of 26 - l - 10/07/2015 14:21:05
-## Species 13 of 26 - m - 10/07/2015 14:21:05
-## Species 14 of 26 - n - 10/07/2015 14:21:05
-## Species 15 of 26 - o - 10/07/2015 14:21:05
-## Species 16 of 26 - p - 10/07/2015 14:21:05
-## Species 17 of 26 - q - 10/07/2015 14:21:05
-## Species 18 of 26 - r - 10/07/2015 14:21:05
-## Species 19 of 26 - s - 10/07/2015 14:21:05
-## Species 20 of 26 - t - 10/07/2015 14:21:05
-## Species 21 of 26 - u - 10/07/2015 14:21:05
-## Species 22 of 26 - v - 10/07/2015 14:21:05
-## Species 23 of 26 - w - 10/07/2015 14:21:05
-## Species 24 of 26 - x - 10/07/2015 14:21:05
-## Species 25 of 26 - y - 10/07/2015 14:21:05
-## Species 26 of 26 - z - 10/07/2015 14:21:05
-## [1] "frescalo complete"
-```
-
-
-We get a warning from this analysis that our value of phi is too low.
-In this case this is because our simulated data suggests every species is found on every site in our time periods. This is a little unrealistic, but should you get a similar warning with your data you might want to consult [Hill (2012)](http://onlinelibrary.wiley.com/doi/10.1111/j.2041-210X.2011.00146.x/suppinfo) and change your input value of phi.
-
-The object that is returned (`frescalo_results` in my case) is an object of class `frescalo`. This means there are a couple of special methods we can use with it.
-
-
-```r
-# Using 'summary' gives a quick overview of our data
-# This can be useful to double check that your data was read in correctly
-summary(frescalo_results)
-```
-
-```
-## Actual numbers in data
-## Number of samples             100
-## Number of species              26
-## Number of time periods          2
-## Number of observations       2239
-## Neighbourhood weights        1000
-## Benchmark exclusions            0
-## Filter locations included       0
-```
-
-```r
-# Using 'print' we get a preview of the results
-print(frescalo_results)
-```
-
-```
-##
-## Preview of $paths - file paths to frescalo log, stats, freq, trend .csv files:
-##
-## [1] "~/myFolder/frescalo_150710/Output/Log.txt"
-## [2] "~/myFolder/frescalo_150710/Output/Stats.csv"
-## [3] "~/myFolder/frescalo_150710/Output/Freq.csv"
-## [4] "~/myFolder/frescalo_150710/Output/Trend.csv"
-## [5] "~/myFolder/frescalo_150710/Output/Freq_quickload.txt"
-##
-##
-## Preview of $trend - trends file, giving the tfactor value for each species at each time period:
-##
-##   Species   Time TFactor StDev  X Xspt Xest N.0.00 N.0.98
-## 1       a 1984.5   0.544 0.201  8    8    8     92      0
-## 2       a 1994.5   1.143 0.302 17   17   17     92      0
-## 3       j 1984.5   1.372 0.237 46   45   45    100      0
-## 4       j 1994.5   0.702 0.133 35   35   35    100      1
-## 5       k 1984.5   0.961 0.167 44   43   43    100      0
-## 6       k 1994.5   0.816 0.144 46   45   45    100      5
-##
-##
-## Preview of $stat - statistics for each hectad in the analysis:
-##
-##   Location Loc_no No_spp Phi_in Alpha Wgt_n2 Phi_out Spnum_in Spnum_out
-## 1       A1      1     11  0.815  0.66   1.58    0.74     11.5       9.8
-## 2      A10      2     13  0.717  1.14   3.77    0.74     14.9      15.7
-## 3     A100      3     18  0.828  0.58   3.01    0.74     18.1      14.9
-## 4      A11      4     16  0.847  0.49   1.81    0.74     16.9      13.3
-## 5      A12      5     11  0.718  1.15   3.22    0.74     14.5      15.3
-## 6      A13      6      8  0.681  1.32   3.88    0.74     14.5      16.4
-##   Iter
-## 1   15
-## 2    5
-## 3    8
-## 4    9
-## 5    3
-## 6    8
-##
-##
-## Preview of $freq - rescaled frequencies for each location and species:
-##
-##   Location Species Pres   Freq  Freq1 SDFrq1 Rank Rank1
-## 1       A1       v    1 0.9778 0.9177 0.1372    1 0.102
-## 2       A1       k    1 0.9722 0.9046 0.1494    2 0.204
-## 3       A1       w    1 0.9634 0.8856 0.1659    3 0.305
-## 4       A1       y    1 0.9563 0.8715 0.1776    4 0.407
-## 5       A1       x    1 0.9412 0.8440 0.1992    5 0.509
-## 6       A1       e    1 0.8965 0.7740 0.2491    6 0.611
-##
-##
-## Preview of $log - log file:
-##
-## Number of species             26
-## Number of time periods         2
-## Number of observations      2239
-## Neighbourhood weights       1000
-## Benchmark exclusions           0
-## Filter locations included      0
-##
-##
-## 98.5 percentile of input phi  0.89
-## Target value of phi           0.74
-##
-##
-##
-##
-## Preview of $lm_stats - trends in tfactor over time:
-##
-##    SPECIES NAME       b          a b_std_err b_tval b_pval a_std_err
-## 1       S1    a  0.0599 -118.32755        NA     NA     NA        NA
-## 12      S2    b  0.0021   -3.27645        NA     NA     NA        NA
-## 20      S3    c  0.0045   -8.04525        NA     NA     NA        NA
-## 21      S4    d  0.0365  -71.60625        NA     NA     NA        NA
-## 22      S5    e -0.0046    9.96270        NA     NA     NA        NA
-## 23      S6    f  0.0326  -63.82470        NA     NA     NA        NA
-##    a_tval a_pval adj_r2 r2 F_val F_num_df F_den_df   Ymin   Ymax
-## 1      NA     NA     NA  1    NA        1        0 1984.5 1994.5
-## 12     NA     NA     NA  1    NA        1        0 1984.5 1994.5
-## 20     NA     NA     NA  1    NA        1        0 1984.5 1994.5
-## 21     NA     NA     NA  1    NA        1        0 1984.5 1994.5
-## 22     NA     NA     NA  1    NA        1        0 1984.5 1994.5
-## 23     NA     NA     NA  1    NA        1        0 1984.5 1994.5
-##          Z_VAL SIG_95
-## 1   1.65116558  FALSE
-## 12  0.06358704  FALSE
-## 20  0.14903522  FALSE
-## 21  1.17443512  FALSE
-## 22 -0.17104272  FALSE
-## 23  1.12652478  FALSE
-```
-
-```r
-# There is a lot of information here and you can read more about
-# what these data mean by looking at the frescalo help file
-# The files detailed in paths are also in the object returned
-frescalo_results$paths
-```
-
-```
-## [1] "~/myFolder/frescalo_150710/Output/Log.txt"
-## [2] "~/myFolder/frescalo_150710/Output/Stats.csv"
-## [3] "~/myFolder/frescalo_150710/Output/Freq.csv"
-## [4] "~/myFolder/frescalo_150710/Output/Trend.csv"
-## [5] "~/myFolder/frescalo_150710/Output/Freq_quickload.txt"
-```
-
-```r
-names(frescalo_results)
-```
-
-```
-## [1] "paths"    "trend"    "stat"     "freq"     "log"      "lm_stats"
-```
-
-```r
-# However we additionally get some model results in our returned object
-# under '$lm_stats'
-```
-
-The results from frescalo may seem complex at first, and I suggest reading the Value section of the frescalo help file for details. In brief: `frescalo_results$paths` lists the file paths of the raw data files for `$log`, `$stat`, `$freq` and `$trend`, in that order. `frescalo_results$trend` is a data.frame providing the time factors (a measure of probability of occurrence relative to benchmark species) for each species in each time period. `frescalo_results$stat` is a data.frame giving details about sites, such as estimated species richness. `frescalo_results$freq` is a data.frame of the species frequencies, that is, the probabilities that a species was present at a certain location. `frescalo_results$log` is a simple report of the console output from the .exe. `frescalo_results$lm_stats` is a data.frame giving the results of a linear regression of TFactors for each species when more than two time periods are used. If only two time periods are used (as in our example), the linear modelling section of this data.frame is filled with NAs and a z-test is performed instead (the results of which are given in the last columns).
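-For example, to pull out only the species flagged as changing significantly at the 95% level, you could subset on the `SIG_95` column (a sketch based on the column names shown above):
-
-```r
-# Species whose z-test indicates a significant change between time periods
-subset(frescalo_results$lm_stats, SIG_95 == TRUE,
-       select = c('NAME', 'Z_VAL', 'SIG_95'))
-```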
-
-
-```r
-# Let's look at some results for the first three species
-frescalo_results$lm_stats[1:3, c('NAME','Z_VAL','SIG_95')]
-```
-
-```
-##    NAME      Z_VAL SIG_95
-## 1     a 1.65116558  FALSE
-## 12    b 0.06358704  FALSE
-## 20    c 0.14903522  FALSE
-```
-
-```r
-# None of these have a significant change using a z-test
-# Let's look at the raw data
-frescalo_results$trend[frescalo_results$trend$Species %in% c('a', 'b', 'c'),
-                       c('Species', 'Time', 'TFactor', 'StDev')]
-```
-
-```
-##    Species   Time TFactor StDev
-## 1        a 1984.5   0.544 0.201
-## 2        a 1994.5   1.143 0.302
-## 23       b 1984.5   0.891 0.237
-## 24       b 1994.5   0.912 0.230
-## 39       c 1984.5   0.885 0.215
-## 40       c 1994.5   0.930 0.212
-```
-
-```r
-# We can see from these results that the large standard deviations on
-# the TFactor values mean there is no real difference between the
-# two time periods
-```
-
-If your data are from the UK and sites are given as grid references, there is functionality to plot a simple output of your results.
-
-
-```r
-# This only works with UK grid references
-# We can load an example dataset from the UK
-data(unicorns)
-head(unicorns)
-```
-
-```
-##          TO_STARTDATE                Date hectad   kmsq    CONCEPT
-## 1 1968-08-06 00:00:00 1968-08-06 00:00:00   NZ28        Species 18
-## 2 1951-05-12 00:00:00 1951-05-12 00:00:00   SO34        Species 11
-## 3 1946-05-06 00:00:00 1946-05-06 00:00:00   SO34 SO3443 Species 11
-## 4 1980-05-01 00:00:00 1980-05-31 00:00:00   SO34        Species 11
-## 5 1829-12-31 23:58:45 1830-12-30 23:58:45   SH48        Species 11
-## 6 1981-06-21 00:00:00 1981-06-21 00:00:00   SO37        Species 11
-```
-
-```r
-# Now run frescalo using the built-in weights file
-unicorn_results <- frescalo(Data = unicorns,
-                            frespath = myFrescaloPath,
-                            time_periods = myTimePeriods,
-                            site_col = 'hectad',
-                            sp_col = 'CONCEPT',
-                            start_col = 'TO_STARTDATE',
-                            end_col = 'Date',
-                            sinkdir = myFolder)
-```
-
-```
-## Warning in weights_data_matchup(weights_sites = unpacked$site_names,
-## data_sites = unique(Data$site)): 1 sites appear your data but are not in
-## your weights file, these will be ignored
-```
-
-```
-## Warning in frescalo(Data = unicorns, frespath = myFrescaloPath,
-## time_periods = myTimePeriods, : sinkdir already contains frescalo output.
-## New data saved in ~/myFolder/frescalo_150710(2) -``` - -``` -## -## SAVING DATA TO FRESCALO WORKING DIRECTORY -## ******************** -## -## -## RUNNING FRESCALO -## ******************** -## -## Building Species List - Complete -## Outputting Species Results -## Species 1 of 55 - Species 1 - 10/07/2015 15:40:46 -## Species 2 of 55 - Species 10 - 10/07/2015 15:40:46 -## Species 3 of 55 - Species 11 - 10/07/2015 15:40:46 -## Species 4 of 55 - Species 12 - 10/07/2015 15:40:46 -## Species 5 of 55 - Species 13 - 10/07/2015 15:40:46 -## Species 6 of 55 - Species 14 - 10/07/2015 15:40:46 -## Species 7 of 55 - Species 15 - 10/07/2015 15:40:46 -## Species 8 of 55 - Species 16 - 10/07/2015 15:40:46 -## Species 9 of 55 - Species 17 - 10/07/2015 15:40:46 -## Species 10 of 55 - Species 18 - 10/07/2015 15:40:46 -## Species 11 of 55 - Species 19 - 10/07/2015 15:40:46 -## Species 12 of 55 - Species 2 - 10/07/2015 15:40:46 -## Species 13 of 55 - Species 20 - 10/07/2015 15:40:46 -## Species 14 of 55 - Species 21 - 10/07/2015 15:40:46 -## Species 15 of 55 - Species 22 - 10/07/2015 15:40:46 -## Species 16 of 55 - Species 23 - 10/07/2015 15:40:46 -## Species 17 of 55 - Species 24 - 10/07/2015 15:40:46 -## Species 18 of 55 - Species 25 - 10/07/2015 15:40:46 -## Species 19 of 55 - Species 27 - 10/07/2015 15:40:46 -## Species 20 of 55 - Species 28 - 10/07/2015 15:40:46 -## Species 21 of 55 - Species 29 - 10/07/2015 15:40:46 -## Species 22 of 55 - Species 3 - 10/07/2015 15:40:46 -## Species 23 of 55 - Species 30 - 10/07/2015 15:40:46 -## Species 24 of 55 - Species 31 - 10/07/2015 15:40:46 -## Species 25 of 55 - Species 32 - 10/07/2015 15:40:46 -## Species 26 of 55 - Species 33 - 10/07/2015 15:40:46 -## Species 27 of 55 - Species 34 - 10/07/2015 15:40:46 -## Species 28 of 55 - Species 35 - 10/07/2015 15:40:46 -## Species 29 of 55 - Species 36 - 10/07/2015 15:40:46 -## Species 30 of 55 - Species 37 - 10/07/2015 15:40:46 -## Species 31 of 55 - Species 38 - 10/07/2015 15:40:46 -## Species 32 of 55 - Species 39 - 10/07/2015 15:40:46 -## Species 33 of 55 - Species 4 - 10/07/2015 15:40:46 -## Species 34 of 55 - Species 40 - 10/07/2015 15:40:46 -## Species 35 of 55 - Species 41 - 10/07/2015 15:40:46 -## Species 36 of 55 - Species 42 - 10/07/2015 15:40:46 -## Species 37 of 55 - Species 44 - 10/07/2015 15:40:46 -## Species 38 of 55 - Species 45 - 10/07/2015 15:40:46 -## Species 39 of 55 - Species 46 - 10/07/2015 15:40:46 -## Species 40 of 55 - Species 47 - 10/07/2015 15:40:46 -## Species 41 of 55 - Species 48 - 10/07/2015 15:40:46 -## Species 42 of 55 - Species 49 - 10/07/2015 15:40:46 -## Species 43 of 55 - Species 5 - 10/07/2015 15:40:46 -## Species 44 of 55 - Species 50 - 10/07/2015 15:40:46 -## Species 45 of 55 - Species 51 - 10/07/2015 15:40:46 -## Species 46 of 55 - Species 52 - 10/07/2015 15:40:46 -## Species 47 of 55 - Species 54 - 10/07/2015 15:40:46 -## Species 48 of 55 - Species 55 - 10/07/2015 15:40:46 -## Species 49 of 55 - Species 56 - 10/07/2015 15:40:46 -## Species 50 of 55 - Species 57 - 10/07/2015 15:40:46 -## Species 51 of 55 - Species 6 - 10/07/2015 15:40:46 -## Species 52 of 55 - Species 66 - 10/07/2015 15:40:46 -## Species 53 of 55 - Species 7 - 10/07/2015 15:40:46 -## Species 54 of 55 - Species 8 - 10/07/2015 15:40:46 -## Species 55 of 55 - Species 9 - 10/07/2015 15:40:46 -## [1] "frescalo complete" -``` - -It is worth noting the console output here. 
-We get a warning telling us that we have some data from a site that is not in the weights file; we might want to investigate that and add the site to the weights file, but we will ignore it for now. The second warning tells us that the `sinkdir` we gave already contains frescalo output; the function gets around this by renaming the new output folder. Finally, we get a long list of all the species as their data are compiled internally.
-
-Now for the plotting.
-
-
-```r
-plot(unicorn_results)
-```
-
-![](unnamed-chunk-31-1.png)
-
-Each panel of the plot gives different information about your results. The top right plot shows the observed number of species at each site (given in `unicorn_results$stat$No_spp`); this can be contrasted with the top left plot, which gives the estimated number of species after accounting for recording effort (given in `unicorn_results$stat$Spnum_out`). Recording effort is presented in the bottom left panel - low values of alpha (white) show areas of high recording effort (given in `unicorn_results$stat$Alpha`) - and a summary of the species trends is given in the bottom right (taken from `unicorn_results$lm_stats`). In this case there is a skew towards species increasing; however, some of these increases may be non-significant, which could be explored in more detail by referring to `unicorn_results$lm_stats`.
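-As a sketch of that follow-up, we could rank the species by the strength of their change using the `Z_VAL` and `SIG_95` columns seen earlier:
-
-```r
-# Rank the unicorn species by the strength of their change
-ranked <- unicorn_results$lm_stats[order(unicorn_results$lm_stats$Z_VAL,
-                                         decreasing = TRUE), ]
-head(ranked[, c('NAME', 'Z_VAL', 'SIG_95')])
-```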
-# References
-
-1. [Hill, M.O. (2012) Local frequency as a key to interpreting species occurrence data when recording effort is not known. Methods Ecol. Evol. 3, 195-205](http://onlinelibrary.wiley.com/doi/10.1111/j.2041-210X.2011.00146.x/suppinfo)
-2. [Isaac, N.J.B. et al. (2014) Statistics for citizen science: extracting signals of change from noisy ecological data. Methods Ecol. Evol. 5, 1052-1060](http://onlinelibrary.wiley.com/doi/10.1111/2041-210X.12254/abstract)
-3. [Roy, H.E. et al. (2012) Invasive alien predator causes rapid declines of native European ladybirds. Divers. Distrib. 18, 717-725](http://onlinelibrary.wiley.com/doi/10.1111/j.1472-4642.2012.00883.x/abstract)
-4. [Telfer, M.G. et al. (2002) A general method for measuring relative change in range size from biological atlas data. Biol. Conserv. 107, 99-109](http://www.sciencedirect.com/science/article/pii/S0006320702000502#)
-5. [Thomas, J.A. et al. (2015) Recent trends in UK insects that inhabit early successional stages of ecosystems. Biol. J. Linn. Soc. 115, 636-646](http://onlinelibrary.wiley.com/doi/10.1111/bij.12527/full)
diff --git a/docs/articles/sparta_vignette.pdf b/docs/articles/sparta_vignette.pdf deleted file mode 100644 index 7567628..0000000 Binary files a/docs/articles/sparta_vignette.pdf and /dev/null differ diff --git a/docs/articles/sparta_vignette_cache/html/__packages b/docs/articles/sparta_vignette_cache/html/__packages deleted file mode 100644 index 0f0bcd5..0000000 --- a/docs/articles/sparta_vignette_cache/html/__packages +++ /dev/null @@ -1,6 +0,0 @@ -base -Matrix -lme4 -sparta -snow -snowfall diff --git a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-11_1a9c6297bc0c8f444cfee8bb6f43b2c7.RData b/docs/articles/sparta_vignette_cache/html/unnamed-chunk-11_1a9c6297bc0c8f444cfee8bb6f43b2c7.RData deleted file mode 100644 index 084378e..0000000 Binary files a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-11_1a9c6297bc0c8f444cfee8bb6f43b2c7.RData and /dev/null differ diff --git a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-11_1a9c6297bc0c8f444cfee8bb6f43b2c7.rdb b/docs/articles/sparta_vignette_cache/html/unnamed-chunk-11_1a9c6297bc0c8f444cfee8bb6f43b2c7.rdb deleted file mode 100644 index 9d00a34..0000000 Binary files a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-11_1a9c6297bc0c8f444cfee8bb6f43b2c7.rdb and /dev/null differ diff --git a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-11_1a9c6297bc0c8f444cfee8bb6f43b2c7.rdx b/docs/articles/sparta_vignette_cache/html/unnamed-chunk-11_1a9c6297bc0c8f444cfee8bb6f43b2c7.rdx deleted file mode 100644 index b0f679d..0000000 Binary files a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-11_1a9c6297bc0c8f444cfee8bb6f43b2c7.rdx and /dev/null differ diff --git a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-12_e52213877057eef4420e77b4a86d9f4c.RData b/docs/articles/sparta_vignette_cache/html/unnamed-chunk-12_e52213877057eef4420e77b4a86d9f4c.RData deleted file mode 100644 index 00ed24b..0000000 Binary files a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-12_e52213877057eef4420e77b4a86d9f4c.RData and /dev/null differ diff --git a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-12_e52213877057eef4420e77b4a86d9f4c.rdb b/docs/articles/sparta_vignette_cache/html/unnamed-chunk-12_e52213877057eef4420e77b4a86d9f4c.rdb deleted file mode 100644 index 7b4f1db..0000000 Binary files a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-12_e52213877057eef4420e77b4a86d9f4c.rdb and /dev/null differ diff --git a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-12_e52213877057eef4420e77b4a86d9f4c.rdx b/docs/articles/sparta_vignette_cache/html/unnamed-chunk-12_e52213877057eef4420e77b4a86d9f4c.rdx deleted file mode 100644 index b2fd5e6..0000000 Binary files a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-12_e52213877057eef4420e77b4a86d9f4c.rdx and /dev/null differ diff --git a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-13_675351426c3135a7bbc79248c0f7b24d.RData b/docs/articles/sparta_vignette_cache/html/unnamed-chunk-13_675351426c3135a7bbc79248c0f7b24d.RData deleted file mode 100644 index b2e1fd1..0000000 Binary files a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-13_675351426c3135a7bbc79248c0f7b24d.RData and /dev/null differ diff --git a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-13_675351426c3135a7bbc79248c0f7b24d.rdb b/docs/articles/sparta_vignette_cache/html/unnamed-chunk-13_675351426c3135a7bbc79248c0f7b24d.rdb deleted file mode 100644 index 37760a9..0000000
Binary files a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-13_675351426c3135a7bbc79248c0f7b24d.rdb and /dev/null differ diff --git a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-13_675351426c3135a7bbc79248c0f7b24d.rdx b/docs/articles/sparta_vignette_cache/html/unnamed-chunk-13_675351426c3135a7bbc79248c0f7b24d.rdx deleted file mode 100644 index 44b1a1b..0000000 Binary files a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-13_675351426c3135a7bbc79248c0f7b24d.rdx and /dev/null differ diff --git a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-14_ecba3ed6716dacdfca16454b1fce6689.RData b/docs/articles/sparta_vignette_cache/html/unnamed-chunk-14_ecba3ed6716dacdfca16454b1fce6689.RData deleted file mode 100644 index 2ac5e6e..0000000 Binary files a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-14_ecba3ed6716dacdfca16454b1fce6689.RData and /dev/null differ diff --git a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-14_ecba3ed6716dacdfca16454b1fce6689.rdb b/docs/articles/sparta_vignette_cache/html/unnamed-chunk-14_ecba3ed6716dacdfca16454b1fce6689.rdb deleted file mode 100644 index 25e8171..0000000 Binary files a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-14_ecba3ed6716dacdfca16454b1fce6689.rdb and /dev/null differ diff --git a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-14_ecba3ed6716dacdfca16454b1fce6689.rdx b/docs/articles/sparta_vignette_cache/html/unnamed-chunk-14_ecba3ed6716dacdfca16454b1fce6689.rdx deleted file mode 100644 index 30cd0d4..0000000 Binary files a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-14_ecba3ed6716dacdfca16454b1fce6689.rdx and /dev/null differ diff --git a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-15_7a7ce74477b7805ed93712560386753c.RData b/docs/articles/sparta_vignette_cache/html/unnamed-chunk-15_7a7ce74477b7805ed93712560386753c.RData deleted file mode 100644 index bc1a099..0000000 Binary files a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-15_7a7ce74477b7805ed93712560386753c.RData and /dev/null differ diff --git a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-15_7a7ce74477b7805ed93712560386753c.rdb b/docs/articles/sparta_vignette_cache/html/unnamed-chunk-15_7a7ce74477b7805ed93712560386753c.rdb deleted file mode 100644 index 0203172..0000000 Binary files a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-15_7a7ce74477b7805ed93712560386753c.rdb and /dev/null differ diff --git a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-15_7a7ce74477b7805ed93712560386753c.rdx b/docs/articles/sparta_vignette_cache/html/unnamed-chunk-15_7a7ce74477b7805ed93712560386753c.rdx deleted file mode 100644 index 950b156..0000000 Binary files a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-15_7a7ce74477b7805ed93712560386753c.rdx and /dev/null differ diff --git a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-16_78bfd06cc23015e530ed9ea2ce62250c.RData b/docs/articles/sparta_vignette_cache/html/unnamed-chunk-16_78bfd06cc23015e530ed9ea2ce62250c.RData deleted file mode 100644 index 0d98912..0000000 Binary files a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-16_78bfd06cc23015e530ed9ea2ce62250c.RData and /dev/null differ diff --git a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-16_78bfd06cc23015e530ed9ea2ce62250c.rdb b/docs/articles/sparta_vignette_cache/html/unnamed-chunk-16_78bfd06cc23015e530ed9ea2ce62250c.rdb deleted file mode 100644 index 5a413eb..0000000 Binary files 
a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-16_78bfd06cc23015e530ed9ea2ce62250c.rdb and /dev/null differ diff --git a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-16_78bfd06cc23015e530ed9ea2ce62250c.rdx b/docs/articles/sparta_vignette_cache/html/unnamed-chunk-16_78bfd06cc23015e530ed9ea2ce62250c.rdx deleted file mode 100644 index 067db70..0000000 Binary files a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-16_78bfd06cc23015e530ed9ea2ce62250c.rdx and /dev/null differ diff --git a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-17_15659d9be2ee33e7a07a1a68e51d8ae7.RData b/docs/articles/sparta_vignette_cache/html/unnamed-chunk-17_15659d9be2ee33e7a07a1a68e51d8ae7.RData deleted file mode 100644 index 5f89e08..0000000 Binary files a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-17_15659d9be2ee33e7a07a1a68e51d8ae7.RData and /dev/null differ diff --git a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-17_15659d9be2ee33e7a07a1a68e51d8ae7.rdb b/docs/articles/sparta_vignette_cache/html/unnamed-chunk-17_15659d9be2ee33e7a07a1a68e51d8ae7.rdb deleted file mode 100644 index 9bf5846..0000000 Binary files a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-17_15659d9be2ee33e7a07a1a68e51d8ae7.rdb and /dev/null differ diff --git a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-17_15659d9be2ee33e7a07a1a68e51d8ae7.rdx b/docs/articles/sparta_vignette_cache/html/unnamed-chunk-17_15659d9be2ee33e7a07a1a68e51d8ae7.rdx deleted file mode 100644 index 027cdac..0000000 Binary files a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-17_15659d9be2ee33e7a07a1a68e51d8ae7.rdx and /dev/null differ diff --git a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-18_c31160e0e0dee0407b12ccf72d6e3e5c.RData b/docs/articles/sparta_vignette_cache/html/unnamed-chunk-18_c31160e0e0dee0407b12ccf72d6e3e5c.RData deleted file mode 100644 index 7f8f61e..0000000 Binary files a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-18_c31160e0e0dee0407b12ccf72d6e3e5c.RData and /dev/null differ diff --git a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-18_c31160e0e0dee0407b12ccf72d6e3e5c.rdb b/docs/articles/sparta_vignette_cache/html/unnamed-chunk-18_c31160e0e0dee0407b12ccf72d6e3e5c.rdb deleted file mode 100644 index 7bc3fd0..0000000 Binary files a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-18_c31160e0e0dee0407b12ccf72d6e3e5c.rdb and /dev/null differ diff --git a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-18_c31160e0e0dee0407b12ccf72d6e3e5c.rdx b/docs/articles/sparta_vignette_cache/html/unnamed-chunk-18_c31160e0e0dee0407b12ccf72d6e3e5c.rdx deleted file mode 100644 index b80f0ec..0000000 Binary files a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-18_c31160e0e0dee0407b12ccf72d6e3e5c.rdx and /dev/null differ diff --git a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-19_3f13a8e5037a5729f9c7ac00465902c0.RData b/docs/articles/sparta_vignette_cache/html/unnamed-chunk-19_3f13a8e5037a5729f9c7ac00465902c0.RData deleted file mode 100644 index a9f2736..0000000 Binary files a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-19_3f13a8e5037a5729f9c7ac00465902c0.RData and /dev/null differ diff --git a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-19_3f13a8e5037a5729f9c7ac00465902c0.rdb b/docs/articles/sparta_vignette_cache/html/unnamed-chunk-19_3f13a8e5037a5729f9c7ac00465902c0.rdb deleted file mode 100644 index 2ab9eb9..0000000 Binary files 
a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-19_3f13a8e5037a5729f9c7ac00465902c0.rdb and /dev/null differ diff --git a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-19_3f13a8e5037a5729f9c7ac00465902c0.rdx b/docs/articles/sparta_vignette_cache/html/unnamed-chunk-19_3f13a8e5037a5729f9c7ac00465902c0.rdx deleted file mode 100644 index 0d8aca4..0000000 Binary files a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-19_3f13a8e5037a5729f9c7ac00465902c0.rdx and /dev/null differ diff --git a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-22_9ebbe1d47a39ae365d6509559ea68420.RData b/docs/articles/sparta_vignette_cache/html/unnamed-chunk-22_9ebbe1d47a39ae365d6509559ea68420.RData deleted file mode 100644 index 3280a79..0000000 Binary files a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-22_9ebbe1d47a39ae365d6509559ea68420.RData and /dev/null differ diff --git a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-22_9ebbe1d47a39ae365d6509559ea68420.rdb b/docs/articles/sparta_vignette_cache/html/unnamed-chunk-22_9ebbe1d47a39ae365d6509559ea68420.rdb deleted file mode 100644 index a6003cd..0000000 Binary files a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-22_9ebbe1d47a39ae365d6509559ea68420.rdb and /dev/null differ diff --git a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-22_9ebbe1d47a39ae365d6509559ea68420.rdx b/docs/articles/sparta_vignette_cache/html/unnamed-chunk-22_9ebbe1d47a39ae365d6509559ea68420.rdx deleted file mode 100644 index dafb6c6..0000000 Binary files a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-22_9ebbe1d47a39ae365d6509559ea68420.rdx and /dev/null differ diff --git a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-26_66e9cc89342d4345d07d74bac09583a6.RData b/docs/articles/sparta_vignette_cache/html/unnamed-chunk-26_66e9cc89342d4345d07d74bac09583a6.RData deleted file mode 100644 index c3f90d2..0000000 Binary files a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-26_66e9cc89342d4345d07d74bac09583a6.RData and /dev/null differ diff --git a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-26_66e9cc89342d4345d07d74bac09583a6.rdb b/docs/articles/sparta_vignette_cache/html/unnamed-chunk-26_66e9cc89342d4345d07d74bac09583a6.rdb deleted file mode 100644 index 4a6fb8d..0000000 Binary files a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-26_66e9cc89342d4345d07d74bac09583a6.rdb and /dev/null differ diff --git a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-26_66e9cc89342d4345d07d74bac09583a6.rdx b/docs/articles/sparta_vignette_cache/html/unnamed-chunk-26_66e9cc89342d4345d07d74bac09583a6.rdx deleted file mode 100644 index e0c4928..0000000 Binary files a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-26_66e9cc89342d4345d07d74bac09583a6.rdx and /dev/null differ diff --git a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-27_b6277a3823397bf594a3caca2060c5a7.RData b/docs/articles/sparta_vignette_cache/html/unnamed-chunk-27_b6277a3823397bf594a3caca2060c5a7.RData deleted file mode 100644 index 5236646..0000000 Binary files a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-27_b6277a3823397bf594a3caca2060c5a7.RData and /dev/null differ diff --git a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-27_b6277a3823397bf594a3caca2060c5a7.rdb b/docs/articles/sparta_vignette_cache/html/unnamed-chunk-27_b6277a3823397bf594a3caca2060c5a7.rdb deleted file mode 100644 index a3546e0..0000000 Binary files 
a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-27_b6277a3823397bf594a3caca2060c5a7.rdb and /dev/null differ diff --git a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-27_b6277a3823397bf594a3caca2060c5a7.rdx b/docs/articles/sparta_vignette_cache/html/unnamed-chunk-27_b6277a3823397bf594a3caca2060c5a7.rdx deleted file mode 100644 index d8e7b85..0000000 Binary files a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-27_b6277a3823397bf594a3caca2060c5a7.rdx and /dev/null differ diff --git a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-29_e38993a7072d20d8f3836c4a4fa53d18.RData b/docs/articles/sparta_vignette_cache/html/unnamed-chunk-29_e38993a7072d20d8f3836c4a4fa53d18.RData deleted file mode 100644 index 60320cf..0000000 Binary files a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-29_e38993a7072d20d8f3836c4a4fa53d18.RData and /dev/null differ diff --git a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-29_e38993a7072d20d8f3836c4a4fa53d18.rdb b/docs/articles/sparta_vignette_cache/html/unnamed-chunk-29_e38993a7072d20d8f3836c4a4fa53d18.rdb deleted file mode 100644 index e69de29..0000000 diff --git a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-29_e38993a7072d20d8f3836c4a4fa53d18.rdx b/docs/articles/sparta_vignette_cache/html/unnamed-chunk-29_e38993a7072d20d8f3836c4a4fa53d18.rdx deleted file mode 100644 index c09d9a3..0000000 Binary files a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-29_e38993a7072d20d8f3836c4a4fa53d18.rdx and /dev/null differ diff --git a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-30_ff28f3cc9a3aa20da923eaf734c1cd91.RData b/docs/articles/sparta_vignette_cache/html/unnamed-chunk-30_ff28f3cc9a3aa20da923eaf734c1cd91.RData deleted file mode 100644 index 29fe480..0000000 Binary files a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-30_ff28f3cc9a3aa20da923eaf734c1cd91.RData and /dev/null differ diff --git a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-30_ff28f3cc9a3aa20da923eaf734c1cd91.rdb b/docs/articles/sparta_vignette_cache/html/unnamed-chunk-30_ff28f3cc9a3aa20da923eaf734c1cd91.rdb deleted file mode 100644 index 22d5e0e..0000000 Binary files a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-30_ff28f3cc9a3aa20da923eaf734c1cd91.rdb and /dev/null differ diff --git a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-30_ff28f3cc9a3aa20da923eaf734c1cd91.rdx b/docs/articles/sparta_vignette_cache/html/unnamed-chunk-30_ff28f3cc9a3aa20da923eaf734c1cd91.rdx deleted file mode 100644 index bac53f3..0000000 Binary files a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-30_ff28f3cc9a3aa20da923eaf734c1cd91.rdx and /dev/null differ diff --git a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-31_7b62bc247c80c78e64ce6f14dd4321fa.RData b/docs/articles/sparta_vignette_cache/html/unnamed-chunk-31_7b62bc247c80c78e64ce6f14dd4321fa.RData deleted file mode 100644 index 71c3c66..0000000 Binary files a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-31_7b62bc247c80c78e64ce6f14dd4321fa.RData and /dev/null differ diff --git a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-31_7b62bc247c80c78e64ce6f14dd4321fa.rdb b/docs/articles/sparta_vignette_cache/html/unnamed-chunk-31_7b62bc247c80c78e64ce6f14dd4321fa.rdb deleted file mode 100644 index e69de29..0000000 diff --git a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-31_7b62bc247c80c78e64ce6f14dd4321fa.rdx b/docs/articles/sparta_vignette_cache/html/unnamed-chunk-31_7b62bc247c80c78e64ce6f14dd4321fa.rdx deleted file mode 
100644 index c09d9a3..0000000 Binary files a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-31_7b62bc247c80c78e64ce6f14dd4321fa.rdx and /dev/null differ diff --git a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-4_03b4e3cb0db883b11803284ae8b05f19.RData b/docs/articles/sparta_vignette_cache/html/unnamed-chunk-4_03b4e3cb0db883b11803284ae8b05f19.RData deleted file mode 100644 index 87e8bfc..0000000 Binary files a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-4_03b4e3cb0db883b11803284ae8b05f19.RData and /dev/null differ diff --git a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-4_03b4e3cb0db883b11803284ae8b05f19.rdb b/docs/articles/sparta_vignette_cache/html/unnamed-chunk-4_03b4e3cb0db883b11803284ae8b05f19.rdb deleted file mode 100644 index 7c27611..0000000 Binary files a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-4_03b4e3cb0db883b11803284ae8b05f19.rdb and /dev/null differ diff --git a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-4_03b4e3cb0db883b11803284ae8b05f19.rdx b/docs/articles/sparta_vignette_cache/html/unnamed-chunk-4_03b4e3cb0db883b11803284ae8b05f19.rdx deleted file mode 100644 index 3ece5d6..0000000 Binary files a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-4_03b4e3cb0db883b11803284ae8b05f19.rdx and /dev/null differ diff --git a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-5_8e93bebdf7cbcbb7c57f668999b16d32.RData b/docs/articles/sparta_vignette_cache/html/unnamed-chunk-5_8e93bebdf7cbcbb7c57f668999b16d32.RData deleted file mode 100644 index 90cdc4e..0000000 Binary files a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-5_8e93bebdf7cbcbb7c57f668999b16d32.RData and /dev/null differ diff --git a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-5_8e93bebdf7cbcbb7c57f668999b16d32.rdb b/docs/articles/sparta_vignette_cache/html/unnamed-chunk-5_8e93bebdf7cbcbb7c57f668999b16d32.rdb deleted file mode 100644 index be6be29..0000000 Binary files a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-5_8e93bebdf7cbcbb7c57f668999b16d32.rdb and /dev/null differ diff --git a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-5_8e93bebdf7cbcbb7c57f668999b16d32.rdx b/docs/articles/sparta_vignette_cache/html/unnamed-chunk-5_8e93bebdf7cbcbb7c57f668999b16d32.rdx deleted file mode 100644 index 0373f1b..0000000 Binary files a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-5_8e93bebdf7cbcbb7c57f668999b16d32.rdx and /dev/null differ diff --git a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-7_24769790a9e05397d78f5d2ad4b83c8b.RData b/docs/articles/sparta_vignette_cache/html/unnamed-chunk-7_24769790a9e05397d78f5d2ad4b83c8b.RData deleted file mode 100644 index 786413f..0000000 Binary files a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-7_24769790a9e05397d78f5d2ad4b83c8b.RData and /dev/null differ diff --git a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-7_24769790a9e05397d78f5d2ad4b83c8b.rdb b/docs/articles/sparta_vignette_cache/html/unnamed-chunk-7_24769790a9e05397d78f5d2ad4b83c8b.rdb deleted file mode 100644 index c4660ff..0000000 Binary files a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-7_24769790a9e05397d78f5d2ad4b83c8b.rdb and /dev/null differ diff --git a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-7_24769790a9e05397d78f5d2ad4b83c8b.rdx b/docs/articles/sparta_vignette_cache/html/unnamed-chunk-7_24769790a9e05397d78f5d2ad4b83c8b.rdx deleted file mode 100644 index 94caa95..0000000 Binary files 
a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-7_24769790a9e05397d78f5d2ad4b83c8b.rdx and /dev/null differ diff --git a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-8_a66c7718e30a881cd6532cf0464667ff.RData b/docs/articles/sparta_vignette_cache/html/unnamed-chunk-8_a66c7718e30a881cd6532cf0464667ff.RData deleted file mode 100644 index a05628a..0000000 Binary files a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-8_a66c7718e30a881cd6532cf0464667ff.RData and /dev/null differ diff --git a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-8_a66c7718e30a881cd6532cf0464667ff.rdb b/docs/articles/sparta_vignette_cache/html/unnamed-chunk-8_a66c7718e30a881cd6532cf0464667ff.rdb deleted file mode 100644 index 1f53e0e..0000000 Binary files a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-8_a66c7718e30a881cd6532cf0464667ff.rdb and /dev/null differ diff --git a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-8_a66c7718e30a881cd6532cf0464667ff.rdx b/docs/articles/sparta_vignette_cache/html/unnamed-chunk-8_a66c7718e30a881cd6532cf0464667ff.rdx deleted file mode 100644 index 4083bb8..0000000 Binary files a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-8_a66c7718e30a881cd6532cf0464667ff.rdx and /dev/null differ diff --git a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-9_7ff8c20eb0cf725336de5a9d9bc04916.RData b/docs/articles/sparta_vignette_cache/html/unnamed-chunk-9_7ff8c20eb0cf725336de5a9d9bc04916.RData deleted file mode 100644 index 028c9fa..0000000 Binary files a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-9_7ff8c20eb0cf725336de5a9d9bc04916.RData and /dev/null differ diff --git a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-9_7ff8c20eb0cf725336de5a9d9bc04916.rdb b/docs/articles/sparta_vignette_cache/html/unnamed-chunk-9_7ff8c20eb0cf725336de5a9d9bc04916.rdb deleted file mode 100644 index abbbf3b..0000000 Binary files a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-9_7ff8c20eb0cf725336de5a9d9bc04916.rdb and /dev/null differ diff --git a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-9_7ff8c20eb0cf725336de5a9d9bc04916.rdx b/docs/articles/sparta_vignette_cache/html/unnamed-chunk-9_7ff8c20eb0cf725336de5a9d9bc04916.rdx deleted file mode 100644 index 255e7c0..0000000 Binary files a/docs/articles/sparta_vignette_cache/html/unnamed-chunk-9_7ff8c20eb0cf725336de5a9d9bc04916.rdx and /dev/null differ diff --git a/docs/articles/sparta_vignette_files/figure-html/unnamed-chunk-15-1.png b/docs/articles/sparta_vignette_files/figure-html/unnamed-chunk-15-1.png deleted file mode 100644 index 5ac9484..0000000 Binary files a/docs/articles/sparta_vignette_files/figure-html/unnamed-chunk-15-1.png and /dev/null differ diff --git a/docs/articles/sparta_vignette_files/figure-html/unnamed-chunk-17-1.png b/docs/articles/sparta_vignette_files/figure-html/unnamed-chunk-17-1.png deleted file mode 100644 index 18d5397..0000000 Binary files a/docs/articles/sparta_vignette_files/figure-html/unnamed-chunk-17-1.png and /dev/null differ diff --git a/docs/articles/sparta_vignette_files/figure-html/unnamed-chunk-18-1.png b/docs/articles/sparta_vignette_files/figure-html/unnamed-chunk-18-1.png deleted file mode 100644 index 8cdcfef..0000000 Binary files a/docs/articles/sparta_vignette_files/figure-html/unnamed-chunk-18-1.png and /dev/null differ diff --git a/docs/articles/sparta_vignette_files/figure-html/unnamed-chunk-22-1.png b/docs/articles/sparta_vignette_files/figure-html/unnamed-chunk-22-1.png deleted file mode 100644 index 
8cdcfef..0000000 Binary files a/docs/articles/sparta_vignette_files/figure-html/unnamed-chunk-22-1.png and /dev/null differ diff --git a/docs/articles/sparta_vignette_files/figure-html/unnamed-chunk-31-1.png b/docs/articles/sparta_vignette_files/figure-html/unnamed-chunk-31-1.png deleted file mode 100644 index c7467a2..0000000 Binary files a/docs/articles/sparta_vignette_files/figure-html/unnamed-chunk-31-1.png and /dev/null differ diff --git a/docs/articles/sparta_vignette_files/figure-html/unnamed-chunk-4-1.png b/docs/articles/sparta_vignette_files/figure-html/unnamed-chunk-4-1.png deleted file mode 100644 index 59c1965..0000000 Binary files a/docs/articles/sparta_vignette_files/figure-html/unnamed-chunk-4-1.png and /dev/null differ diff --git a/docs/articles/sparta_vignette_files/figure-html/unnamed-chunk-5-1.png b/docs/articles/sparta_vignette_files/figure-html/unnamed-chunk-5-1.png deleted file mode 100644 index 66c8eb5..0000000 Binary files a/docs/articles/sparta_vignette_files/figure-html/unnamed-chunk-5-1.png and /dev/null differ diff --git a/docs/articles/unnamed-chunk-30-1.png b/docs/articles/unnamed-chunk-30-1.png deleted file mode 100644 index 5f27fd0..0000000 Binary files a/docs/articles/unnamed-chunk-30-1.png and /dev/null differ diff --git a/docs/authors.html b/docs/authors.html index c2dd044..5df1a14 100644 --- a/docs/authors.html +++ b/docs/authors.html @@ -1,6 +1,6 @@ - + @@ -9,24 +9,37 @@ Authors • sparta - + - - + + - + + + + + + - - - + + + + + + + + + + - + + @@ -84,35 +103,43 @@ -
-
+
+
  • -

    Tom August. Author, maintainer. +

    Tom August. Author, maintainer. +

    +
  • +
  • +

    Gary Powney. Author.

  • -

    Gary Powney. Author. +

    Charlie Outhwaite. Author.

  • -

    Charlie Outhwaite. Author. +

    Colin Harrower. Author.

  • -

    Colin Harrower. Author. +

    Mark Hill. Author.

  • -

    Mark Hill. Author. +

    Jack Hatfield. Author.

  • -

    Nick Isaac. Author. +

    Francesca Mancini. Author. +

    +
  • +
  • +

    Nick Isaac. Author.

@@ -124,15 +151,17 @@

Authors

-

Site built with pkgdown.

+

Site built with pkgdown 1.3.0.

-
+ + + diff --git a/docs/docsearch.css b/docs/docsearch.css new file mode 100644 index 0000000..e5f1fe1 --- /dev/null +++ b/docs/docsearch.css @@ -0,0 +1,148 @@ +/* Docsearch -------------------------------------------------------------- */ +/* + Source: https://github.com/algolia/docsearch/ + License: MIT +*/ + +.algolia-autocomplete { + display: block; + -webkit-box-flex: 1; + -ms-flex: 1; + flex: 1 +} + +.algolia-autocomplete .ds-dropdown-menu { + width: 100%; + min-width: none; + max-width: none; + padding: .75rem 0; + background-color: #fff; + background-clip: padding-box; + border: 1px solid rgba(0, 0, 0, .1); + box-shadow: 0 .5rem 1rem rgba(0, 0, 0, .175); +} + +@media (min-width:768px) { + .algolia-autocomplete .ds-dropdown-menu { + width: 175% + } +} + +.algolia-autocomplete .ds-dropdown-menu::before { + display: none +} + +.algolia-autocomplete .ds-dropdown-menu [class^=ds-dataset-] { + padding: 0; + background-color: rgb(255,255,255); + border: 0; + max-height: 80vh; +} + +.algolia-autocomplete .ds-dropdown-menu .ds-suggestions { + margin-top: 0 +} + +.algolia-autocomplete .algolia-docsearch-suggestion { + padding: 0; + overflow: visible +} + +.algolia-autocomplete .algolia-docsearch-suggestion--category-header { + padding: .125rem 1rem; + margin-top: 0; + font-size: 1.3em; + font-weight: 500; + color: #00008B; + border-bottom: 0 +} + +.algolia-autocomplete .algolia-docsearch-suggestion--wrapper { + float: none; + padding-top: 0 +} + +.algolia-autocomplete .algolia-docsearch-suggestion--subcategory-column { + float: none; + width: auto; + padding: 0; + text-align: left +} + +.algolia-autocomplete .algolia-docsearch-suggestion--content { + float: none; + width: auto; + padding: 0 +} + +.algolia-autocomplete .algolia-docsearch-suggestion--content::before { + display: none +} + +.algolia-autocomplete .ds-suggestion:not(:first-child) .algolia-docsearch-suggestion--category-header { + padding-top: .75rem; + margin-top: .75rem; + border-top: 1px solid rgba(0, 0, 0, .1) +} + +.algolia-autocomplete .ds-suggestion .algolia-docsearch-suggestion--subcategory-column { + display: block; + padding: .1rem 1rem; + margin-bottom: 0.1; + font-size: 1.0em; + font-weight: 400 + /* display: none */ +} + +.algolia-autocomplete .algolia-docsearch-suggestion--title { + display: block; + padding: .25rem 1rem; + margin-bottom: 0; + font-size: 0.9em; + font-weight: 400 +} + +.algolia-autocomplete .algolia-docsearch-suggestion--text { + padding: 0 1rem .5rem; + margin-top: -.25rem; + font-size: 0.8em; + font-weight: 400; + line-height: 1.25 +} + +.algolia-autocomplete .algolia-docsearch-footer { + width: 110px; + height: 20px; + z-index: 3; + margin-top: 10.66667px; + float: right; + font-size: 0; + line-height: 0; +} + +.algolia-autocomplete .algolia-docsearch-footer--logo { + background-image: url("data:image/svg+xml;utf8,"); + background-repeat: no-repeat; + background-position: 50%; + background-size: 100%; + overflow: hidden; + text-indent: -9000px; + width: 100%; + height: 100%; + display: block; + transform: translate(-8px); +} + +.algolia-autocomplete .algolia-docsearch-suggestion--highlight { + color: #FF8C00; + background: rgba(232, 189, 54, 0.1) +} + + +.algolia-autocomplete .algolia-docsearch-suggestion--text .algolia-docsearch-suggestion--highlight { + box-shadow: inset 0 -2px 0 0 rgba(105, 105, 105, .5) +} + +.algolia-autocomplete .ds-suggestion.ds-cursor .algolia-docsearch-suggestion--content { + background-color: rgba(192, 192, 192, .15) +} diff --git a/docs/docsearch.js b/docs/docsearch.js 
new file mode 100644 index 0000000..b35504c --- /dev/null +++ b/docs/docsearch.js @@ -0,0 +1,85 @@ +$(function() { + + // register a handler to move the focus to the search bar + // upon pressing shift + "/" (i.e. "?") + $(document).on('keydown', function(e) { + if (e.shiftKey && e.keyCode == 191) { + e.preventDefault(); + $("#search-input").focus(); + } + }); + + $(document).ready(function() { + // do keyword highlighting + /* modified from https://jsfiddle.net/julmot/bL6bb5oo/ */ + var mark = function() { + + var referrer = document.URL ; + var paramKey = "q" ; + + if (referrer.indexOf("?") !== -1) { + var qs = referrer.substr(referrer.indexOf('?') + 1); + var qs_noanchor = qs.split('#')[0]; + var qsa = qs_noanchor.split('&'); + var keyword = ""; + + for (var i = 0; i < qsa.length; i++) { + var currentParam = qsa[i].split('='); + + if (currentParam.length !== 2) { + continue; + } + + if (currentParam[0] == paramKey) { + keyword = decodeURIComponent(currentParam[1].replace(/\+/g, "%20")); + } + } + + if (keyword !== "") { + $(".contents").unmark({ + done: function() { + $(".contents").mark(keyword); + } + }); + } + } + }; + + mark(); + }); +}); + +/* Search term highlighting ------------------------------*/ + +function matchedWords(hit) { + var words = []; + + var hierarchy = hit._highlightResult.hierarchy; + // loop to fetch from lvl0, lvl1, etc. + for (var idx in hierarchy) { + words = words.concat(hierarchy[idx].matchedWords); + } + + var content = hit._highlightResult.content; + if (content) { + words = words.concat(content.matchedWords); + } + + // return unique words + var words_uniq = [...new Set(words)]; + return words_uniq; +} + +function updateHitURL(hit) { + + var words = matchedWords(hit); + var url = ""; + + if (hit.anchor) { + url = hit.url_without_anchor + '?q=' + escape(words.join(" ")) + '#' + hit.anchor; + } else { + url = hit.url + '?q=' + escape(words.join(" ")); + } + + return url; +} diff --git a/docs/index.html b/docs/index.html index 9691683..963e69d 100644 --- a/docs/index.html +++ b/docs/index.html @@ -1,15 +1,21 @@ - + Trend Analysis for Unstructured Data • sparta - - - - + + + + + + + @@ -19,13 +25,18 @@
- +

Sparta banner

Build Status codecov.io

This R package includes methods used to analyse trends in unstructured occurrence datasets and a range of useful functions for mapping such data in the UK. The package is currently under development. Note that frescalo currently uses an .exe compiled only for windows.

-### News

+News

We are in the process of re-writing much of sparta to add in things we learnt from our recent publication (Statistics for citizen science: extracting signals of change from noisy ecological data. 2014. Nick J. B. Isaac, Arco J. van Strien, Tom A. August, Marnix P. de Zeeuw and David B. Roy). Once the re-write is complete the package will go on CRAN.

-### Installation

+Installation

To install the development version of sparta, it’s easiest to use the devtools package:

# install.packages("devtools")
 # NOTE: If you have not installed devtools before you will need to restart your R
 # session before installing to avoid problems
 
-library(devtools)
+library(devtools)
 
 # Some users have reported issues with devtools not correctly installing
 # dependencies. Run the following lines to avoid these issues
-list.of.packages <- c("minqa", "lme4", "gtools", "gtable", "scales",
+list.of.packages <- c("minqa", "lme4", "gtools", "gtable", "scales",
                       "assertthat", "magrittr", "tibble", "stringr")
-new.packages <- list.of.packages[!(list.of.packages %in% installed.packages()[,"Package"])]
-if(length(new.packages)) install.packages(new.packages)
+new.packages <- list.of.packages[!(list.of.packages %in% installed.packages()[,"Package"])]
+if(length(new.packages)) install.packages(new.packages)
 
 # Now install sparta
 install_github('BiologicalRecordsCentre/sparta')
 
 # Load sparta
-library(sparta)
+library(sparta)

If you have difficulties installing sparta using this method try updating your version of R to the most up-to-date version available. If you still have problems please contact us or use the issues page.

-### Vignette/Tutorial

+Vignette/Tutorial

We have written a vignette to support the package which you can view here

PLEASE NOTE THAT SINCE THIS PACKAGE IS IN DEVELOPMENT THE STRUCTURE AND FUNCTIONALITY OF THE PACKAGE ARE LIKELY TO CHANGE OVER TIME. WE WILL TRY TO KEEP THIS FRONT PAGE AND TUTORIALS UP TO DATE SO THAT IT WORKS WITH THE CURRENT MASTER VERSION ON GITHUB

- + + diff --git a/docs/jquery.sticky-kit.min.js b/docs/jquery.sticky-kit.min.js deleted file mode 100644 index e2a3c6d..0000000 --- a/docs/jquery.sticky-kit.min.js +++ /dev/null @@ -1,9 +0,0 @@ -/* - Sticky-kit v1.1.2 | WTFPL | Leaf Corcoran 2015 | http://leafo.net -*/ -(function(){var b,f;b=this.jQuery||window.jQuery;f=b(window);b.fn.stick_in_parent=function(d){var A,w,J,n,B,K,p,q,k,E,t;null==d&&(d={});t=d.sticky_class;B=d.inner_scrolling;E=d.recalc_every;k=d.parent;q=d.offset_top;p=d.spacer;w=d.bottoming;null==q&&(q=0);null==k&&(k=void 0);null==B&&(B=!0);null==t&&(t="is_stuck");A=b(document);null==w&&(w=!0);J=function(a,d,n,C,F,u,r,G){var v,H,m,D,I,c,g,x,y,z,h,l;if(!a.data("sticky_kit")){a.data("sticky_kit",!0);I=A.height();g=a.parent();null!=k&&(g=g.closest(k)); -if(!g.length)throw"failed to find stick parent";v=m=!1;(h=null!=p?p&&a.closest(p):b("
"))&&h.css("position",a.css("position"));x=function(){var c,f,e;if(!G&&(I=A.height(),c=parseInt(g.css("border-top-width"),10),f=parseInt(g.css("padding-top"),10),d=parseInt(g.css("padding-bottom"),10),n=g.offset().top+c+f,C=g.height(),m&&(v=m=!1,null==p&&(a.insertAfter(h),h.detach()),a.css({position:"",top:"",width:"",bottom:""}).removeClass(t),e=!0),F=a.offset().top-(parseInt(a.css("margin-top"),10)||0)-q, -u=a.outerHeight(!0),r=a.css("float"),h&&h.css({width:a.outerWidth(!0),height:u,display:a.css("display"),"vertical-align":a.css("vertical-align"),"float":r}),e))return l()};x();if(u!==C)return D=void 0,c=q,z=E,l=function(){var b,l,e,k;if(!G&&(e=!1,null!=z&&(--z,0>=z&&(z=E,x(),e=!0)),e||A.height()===I||x(),e=f.scrollTop(),null!=D&&(l=e-D),D=e,m?(w&&(k=e+u+c>C+n,v&&!k&&(v=!1,a.css({position:"fixed",bottom:"",top:c}).trigger("sticky_kit:unbottom"))),eb&&!v&&(c-=l,c=Math.max(b-u,c),c=Math.min(q,c),m&&a.css({top:c+"px"})))):e>F&&(m=!0,b={position:"fixed",top:c},b.width="border-box"===a.css("box-sizing")?a.outerWidth()+"px":a.width()+"px",a.css(b).addClass(t),null==p&&(a.after(h),"left"!==r&&"right"!==r||h.append(a)),a.trigger("sticky_kit:stick")),m&&w&&(null==k&&(k=e+u+c>C+n),!v&&k)))return v=!0,"static"===g.css("position")&&g.css({position:"relative"}), -a.css({position:"absolute",bottom:d,top:"auto"}).trigger("sticky_kit:bottom")},y=function(){x();return l()},H=function(){G=!0;f.off("touchmove",l);f.off("scroll",l);f.off("resize",y);b(document.body).off("sticky_kit:recalc",y);a.off("sticky_kit:detach",H);a.removeData("sticky_kit");a.css({position:"",bottom:"",top:"",width:""});g.position("position","");if(m)return null==p&&("left"!==r&&"right"!==r||a.insertAfter(h),h.remove()),a.removeClass(t)},f.on("touchmove",l),f.on("scroll",l),f.on("resize", -y),b(document.body).on("sticky_kit:recalc",y),a.on("sticky_kit:detach",H),setTimeout(l,0)}};n=0;for(K=this.length;n body > .container + * .Site-content -> body > .container .row + * .footer -> footer + * + * Key idea seems to be to ensure that .container and __all its parents__ + * have height set to 100% + * + */ + +html, body { + height: 100%; +} + body > .container { display: flex; - padding-top: 60px; - min-height: calc(100vh); + height: 100%; flex-direction: column; + + padding-top: 60px; } body > .container .row { - flex: 1; + flex: 1 0 auto; } footer { @@ -16,6 +35,7 @@ footer { border-top: 1px solid #e5e5e5; color: #666; display: flex; + flex-shrink: 0; } footer p { margin-bottom: 0; @@ -38,6 +58,17 @@ img { max-width: 100%; } +/* Fix bug in bootstrap (only seen in firefox) */ +summary { + display: list-item; +} + +/* Typographic tweaking ---------------------------------*/ + +.contents .page-header { + margin-top: calc(-60px + 1em); +} + /* Section anchors ---------------------------------*/ a.anchor { @@ -68,7 +99,7 @@ a.anchor { .contents h1, .contents h2, .contents h3, .contents h4 { padding-top: 60px; - margin-top: -60px; + margin-top: -40px; } /* Static header placement on mobile devices */ @@ -100,16 +131,19 @@ a.anchor { margin-bottom: 0.5em; } +.orcid { + height: 16px; + vertical-align: middle; +} + /* Reference index & topics ----------------------------------------------- */ .ref-index th {font-weight: normal;} -.ref-index h2 {font-size: 20px;} .ref-index td {vertical-align: top;} +.ref-index .icon {width: 40px;} .ref-index .alias {width: 40%;} -.ref-index .title {width: 60%;} - -.ref-index .alias {width: 40%;} +.ref-index-icons .alias {width: calc(40% - 40px);} .ref-index .title {width: 60%;} .ref-arguments th {text-align: 
right; padding-right: 10px;} @@ -137,6 +171,12 @@ pre, code { color: #333; } +pre code { + overflow: auto; + word-wrap: normal; + white-space: pre; +} + pre .img { margin: 5px 0; } @@ -151,6 +191,10 @@ code a, pre a { color: #375f84; } +a.sourceLine:hover { + text-decoration: none; +} + .fl {color: #1514b5;} .fu {color: #000000;} /* function */ .ch,.st {color: #036a07;} /* string */ @@ -161,3 +205,32 @@ code a, pre a { .error { color: orange; font-weight: bolder;} .warning { color: #6A0366; font-weight: bolder;} +/* Clipboard --------------------------*/ + +.hasCopyButton { + position: relative; +} + +.btn-copy-ex { + position: absolute; + right: 0; + top: 0; + visibility: hidden; +} + +.hasCopyButton:hover button.btn-copy-ex { + visibility: visible; +} + +/* mark.js ----------------------------*/ + +mark { + background-color: rgba(255, 255, 51, 0.5); + border-bottom: 2px solid rgba(255, 153, 51, 0.3); + padding: 1px; +} + +/* vertical spacing after htmlwidgets */ +.html-widget { + margin-bottom: 10px; +} diff --git a/docs/pkgdown.js b/docs/pkgdown.js index 4b81713..eb7e83d 100644 --- a/docs/pkgdown.js +++ b/docs/pkgdown.js @@ -1,45 +1,115 @@ -$(function() { - $("#sidebar").stick_in_parent({offset_top: 40}); - $('body').scrollspy({ - target: '#sidebar', - offset: 60 - }); +/* http://gregfranko.com/blog/jquery-best-practices/ */ +(function($) { + $(function() { + + $("#sidebar") + .stick_in_parent({offset_top: 40}) + .on('sticky_kit:bottom', function(e) { + $(this).parent().css('position', 'static'); + }) + .on('sticky_kit:unbottom', function(e) { + $(this).parent().css('position', 'relative'); + }); + + $('body').scrollspy({ + target: '#sidebar', + offset: 60 + }); + + $('[data-toggle="tooltip"]').tooltip(); + + var cur_path = paths(location.pathname); + var links = $("#navbar ul li a"); + var max_length = -1; + var pos = -1; + for (var i = 0; i < links.length; i++) { + if (links[i].getAttribute("href") === "#") + continue; + // Ignore external links + if (links[i].host !== location.host) + continue; + + var nav_path = paths(links[i].pathname); - var cur_path = paths(location.pathname); - $("#navbar ul li a").each(function(index, value) { - if (value.text == "Home") - return; - if (value.getAttribute("href") === "#") - return; - - var path = paths(value.pathname); - if (is_prefix(cur_path, path)) { - // Add class to parent
  • , and enclosing
  • if in dropdown - var menu_anchor = $(value); + var length = prefix_length(nav_path, cur_path); + if (length > max_length) { + max_length = length; + pos = i; + } + } + + // Add class to parent
  • , and enclosing
  • if in dropdown + if (pos >= 0) { + var menu_anchor = $(links[pos]); menu_anchor.parent().addClass("active"); menu_anchor.closest("li.dropdown").addClass("active"); } }); -}); -function paths(pathname) { - var pieces = pathname.split("/"); - pieces.shift(); // always starts with / + function paths(pathname) { + var pieces = pathname.split("/"); + pieces.shift(); // always starts with / + + var end = pieces[pieces.length - 1]; + if (end === "index.html" || end === "") + pieces.pop(); + return(pieces); + } - var end = pieces[pieces.length - 1]; - if (end === "index.html" || end === "") - pieces.pop(); - return(pieces); -} + // Returns -1 if not found + function prefix_length(needle, haystack) { + if (needle.length > haystack.length) + return(-1); -function is_prefix(needle, haystack) { - if (needle.length > haystack.lengh) - return(false); + // Special case for length-0 haystack, since for loop won't run + if (haystack.length === 0) { + return(needle.length === 0 ? 0 : -1); + } - for (var i = 0; i < haystack.length; i++) { - if (needle[i] != haystack[i]) - return(false); + for (var i = 0; i < haystack.length; i++) { + if (needle[i] != haystack[i]) + return(i); + } + + return(haystack.length); + } + + /* Clipboard --------------------------*/ + + function changeTooltipMessage(element, msg) { + var tooltipOriginalTitle=element.getAttribute('data-original-title'); + element.setAttribute('data-original-title', msg); + $(element).tooltip('show'); + element.setAttribute('data-original-title', tooltipOriginalTitle); } - return(true); -} + if(ClipboardJS.isSupported()) { + $(document).ready(function() { + var copyButton = ""; + + $(".examples, div.sourceCode").addClass("hasCopyButton"); + + // Insert copy buttons: + $(copyButton).prependTo(".hasCopyButton"); + + // Initialize tooltips: + $('.btn-copy-ex').tooltip({container: 'body'}); + + // Initialize clipboard: + var clipboardBtnCopies = new ClipboardJS('[data-clipboard-copy]', { + text: function(trigger) { + return trigger.parentNode.textContent; + } + }); + + clipboardBtnCopies.on('success', function(e) { + changeTooltipMessage(e.trigger, 'Copied!'); + e.clearSelection(); + }); + + clipboardBtnCopies.on('error', function() { + changeTooltipMessage(e.trigger,'Press Ctrl+C or Command+C to copy'); + }); + }); + } +})(window.jQuery || window.$) diff --git a/docs/pkgdown.yml b/docs/pkgdown.yml new file mode 100644 index 0000000..f1b42aa --- /dev/null +++ b/docs/pkgdown.yml @@ -0,0 +1,7 @@ +pandoc: 1.19.2.1 +pkgdown: 1.3.0 +pkgdown_sha: ~ +articles: + sparta_vignette: ../../../../../../../../../../../W:/PYWELL_SHARED/Pywell Projects/BRC/Tom + August/R Packages/Trend analyses/sparta/vignettes/sparta_vignette.html + diff --git a/docs/reference/WSS.html b/docs/reference/WSS.html index 4f714d1..286f21c 100644 --- a/docs/reference/WSS.html +++ b/docs/reference/WSS.html @@ -1,32 +1,48 @@ - + -Well sampled sites model — WSS • sparta +Well sampled sites model — WSS • sparta - + - - + + - + + + + + + - - - + + + + + + + + + + + + - + +
  • @@ -84,22 +106,26 @@ -
    +
    +

This function is a wrapper for siteSelection and reportingRateModel that allows users to run a well sampled sites analysis as in Roy et al. (2012).

    +
    WSS(taxa, site, time_period, minL = 2, minTP = 3,
    -  species_to_include = unique(taxa), overdispersion = FALSE,
    +  species_to_include = unique(taxa), overdispersion = FALSE,
       family = "Binomial", verbose = FALSE, print_progress = FALSE)
    -

    Arguments

    +

    Arguments

    @@ -159,7 +185,7 @@

    Value

suffix (after the ".") gives the parameter of that covariate. number_observations gives the number of visits where the species of interest was observed. If any of the models encountered an error this will be given in the - column error_message.

    + column error_message.

    The data.frame has a number of attributes:

    • intercept_year - The year used for the intercept (i.e. the year whose value is set to 0). Setting the intercept to the median year helps @@ -188,43 +214,43 @@

      Examp nSamples <- 20 # set number of dates # Create somes dates -first <- as.POSIXct(strptime("2003/01/01", "%Y/%m/%d")) -last <- as.POSIXct(strptime(paste(2003+(nyr-1),"/12/31", sep=''), "%Y/%m/%d")) +first <- as.POSIXct(strptime("2003/01/01", "%Y/%m/%d")) +last <- as.POSIXct(strptime(paste(2003+(nyr-1),"/12/31", sep=''), "%Y/%m/%d")) dt <- last-first -rDates <- first + (runif(nSamples)*dt) +rDates <- first + (runif(nSamples)*dt) # taxa are set as random letters -taxa <- sample(letters, size = n, TRUE) +taxa <- sample(letters, size = n, TRUE) # three sites are visited randomly -site <- sample(c('one', 'two', 'three'), size = n, TRUE) +site <- sample(c('one', 'two', 'three'), size = n, TRUE) # the date of visit is selected at random from those created earlier -time_period <- sample(rDates, size = n, TRUE) +time_period <- sample(rDates, size = n, TRUE) # combine this to a dataframe -df <- data.frame(taxa, site, time_period) +df <- data.frame(taxa, site, time_period) results <- WSS(df$taxa, df$site, df$time_period, minL = 4, minTP = 3, - species_to_include = c('a', 'b', 'c'))
      #> Warning: 553 out of 1500 observations will be removed as duplicates
      + species_to_include = c('a', 'b', 'c'))
      #> Warning: 542 out of 1500 observations will be removed as duplicates
      #> boundary (singular) fit: see ?isSingular
      #> boundary (singular) fit: see ?isSingular
      #> boundary (singular) fit: see ?isSingular
      # Look at the results for the first few species -head(results)
      #> species_name intercept.estimate year.estimate intercept.stderror -#> 1 a 0.2425899 0.43036252 0.2888391 -#> 2 b 0.5548483 -0.01486593 0.2776787 -#> 3 c 0.1858200 0.16333584 0.2707320 +head(results)
      #> species_name intercept.estimate year.estimate intercept.stderror +#> 1 a 0.8073388 0.05120794 0.2956410 +#> 2 b 0.6035651 -0.15409135 0.2836280 +#> 3 c 0.4843513 0.10465525 0.2830269 #> year.stderror intercept.zvalue year.zvalue intercept.pvalue year.pvalue -#> 1 0.1493173 0.8398792 2.8822023 0.40097612 0.003949062 -#> 2 0.1285670 1.9981662 -0.1156279 0.04569865 0.907947483 -#> 3 0.1280611 0.6863613 1.2754523 0.49248530 0.202149197 +#> 1 0.1272529 2.730808 0.4024108 0.006317919 0.6873817 +#> 2 0.1252298 2.128017 -1.2304687 0.033335712 0.2185216 +#> 3 0.1221340 1.711326 0.8568889 0.087020923 0.3915063 #> observations -#> 1 36 -#> 2 38 -#> 3 34
      # Look at the attributes of the object returned -attributes(results)
      #> $names +#> 1 41 +#> 2 40 +#> 3 36
      # Look at the attributes of the object returned +attributes(results)
      #> $names #> [1] "species_name" "intercept.estimate" "year.estimate" #> [4] "intercept.stderror" "year.stderror" "intercept.zvalue" #> [7] "year.zvalue" "intercept.pvalue" "year.pvalue" @@ -275,15 +301,17 @@

      Contents

      -

      Site built with pkgdown.

      +

      Site built with pkgdown 1.3.0.

      -
      + + + diff --git a/docs/reference/createWeights.html b/docs/reference/createWeights.html index 6990fe4..b7ce236 100644 --- a/docs/reference/createWeights.html +++ b/docs/reference/createWeights.html @@ -1,32 +1,50 @@ - + -Create frescalo weights file — createWeights • sparta +Create frescalo weights file — createWeights • sparta - + - - + + - + + + + + + - - - + + + + + + + + + + + + - + + @@ -84,23 +108,27 @@ -
      +
      +

Create the weights file required to run frescalo, as outlined in Hill (2011). For more information on frescalo see frescalo. This function takes a table of geographical distances between sites and a table of numeric data from which to calculate similarity (for example, land cover or abiotic data).

      +
      createWeights(distances, attributes, dist_sub = 200, sim_sub = 100,
      -  normalise = FALSE)
      + normalise = FALSE, verbose = TRUE) -

      Arguments

      +

      Arguments

    @@ -133,6 +161,10 @@

    Ar

    + + + +

    Logical. If TRUE each attribute is divided by its maximum value to produce values between 0 and 1. Default is FALSE

    verbose

    Logical, should progress be printed to console. Defaults to TRUE

    Value

    @@ -149,29 +181,29 @@

    R

    Examples

    # NOT RUN {
    -mySites <- paste('Site_', 1:100, sep = '')
    +mySites <- paste('Site_', 1:100, sep = '')
     
     # Build a table of distances
    -myDistances <- merge(mySites, mySites)
    +myDistances <- merge(mySites, mySites)
     
     # add random distances
    -myDistances$dist <- runif(n = nrow(myDistances), min = 10, max = 10000)
    +myDistances$dist <- runif(n = nrow(myDistances), min = 10, max = 10000)
     
     # to be realistic the distance from a site to itself should be 0
     myDistances$dist[myDistances$x == myDistances$y] <- 0
     
     # Build a table of attributes
    -myHabitatData <- data.frame(site = mySites,
    -                            grassland = runif(length(mySites), 0, 1),
    -                            woodland = runif(length(mySites), 0, 1),
    -                            heathland = runif(length(mySites), 0, 1),
    -                            urban = runif(length(mySites), 0, 1),
    -                            freshwater = runif(length(mySites), 0, 1))
    +myHabitatData <- data.frame(site = mySites,
    +                            grassland = runif(length(mySites), 0, 1),
    +                            woodland = runif(length(mySites), 0, 1),
    +                            heathland = runif(length(mySites), 0, 1),
    +                            urban = runif(length(mySites), 0, 1),
    +                            freshwater = runif(length(mySites), 0, 1))
     
 # This pretend data is supposed to be proportional cover so let's 
     # make sure each row sums to 1
    -multiples <- apply(myHabitatData[,2:6], 1, sum)
    -for(i in 1:length(mySites)){
    +multiples <- apply(myHabitatData[,2:6], 1, sum)
    +for(i in 1:length(mySites)){
       myHabitatData[i,2:6] <- myHabitatData[i,2:6]/multiples[i]
     }
     
    @@ -199,15 +231,17 @@ 

    Contents

    -

    Site built with pkgdown.

    +

    Site built with pkgdown 1.3.0.

    -
    + + + diff --git a/docs/reference/dataDiagnostics.html b/docs/reference/dataDiagnostics.html index e441029..9e6b6dc 100644 --- a/docs/reference/dataDiagnostics.html +++ b/docs/reference/dataDiagnostics.html @@ -1,32 +1,50 @@ - + -Data Diagnostics — dataDiagnostics • sparta +Data Diagnostics — dataDiagnostics • sparta - + - - + + - + + + + + + - - - + + + + + + + + + + + + - + +
    @@ -84,22 +108,27 @@ -
    +
    +

This function provides visualisations of how the number of records in the dataset changes over time and how the number of species recorded on a visit changes over time. For each of these a linear model is run to test whether there is a significant trend.

    +
    -
    dataDiagnostics(taxa, site, time_period, plot = TRUE, progress_bar = TRUE)
    +
    dataDiagnostics(taxa, site, time_period, plot = TRUE,
    +  progress_bar = TRUE)
    -

    Arguments

    +

    Arguments

    @@ -136,33 +165,33 @@

    Examp
    # NOT RUN {
     ### Diagnostics functions ###
     # Create data
    -n <- 2000 #size of dataset
    +n <- 2000 # size of dataset
     nyr <- 20 # number of years in data
     nSamples <- 200 # set number of dates
     useDates <- TRUE
     
 # Create some dates
    -first <- as.POSIXct(strptime("2003/01/01", "%Y/%m/%d"))
    -last <- as.POSIXct(strptime(paste(2003+(nyr-1),"/12/31", sep=''), "%Y/%m/%d"))
    +first <- as.POSIXct(strptime("2003/01/01", "%Y/%m/%d"))
    +last <- as.POSIXct(strptime(paste(2003+(nyr-1),"/12/31", sep=''), "%Y/%m/%d"))
     dt <- last-first
    -rDates <- first + (runif(nSamples)*dt)
    +rDates <- first + (runif(nSamples)*dt)
     
     # taxa are set as random letters
    -taxa <- sample(letters, size = n, TRUE)
    +taxa <- sample(letters, size = n, TRUE)
     
     # three sites are visited randomly
    -site <- sample(c('one', 'two', 'three'), size = n, TRUE)
    +site <- sample(c('one', 'two', 'three'), size = n, TRUE)
     
     # the date of visit is selected at random from those created earlier
     if(useDates){
    -  time_period <- sample(rDates, size = n, TRUE)
    +  time_period <- sample(rDates, size = n, TRUE)
     } else {
    -  time_period <- sample(1:nSamples, size = n, TRUE)
    +  time_period <- sample(1:nSamples, size = n, TRUE)
     }
     # Using a date
     dataDiagnostics(taxa, site, time_period)
     # Using a numeric
    -dataDiagnostics(taxa, site, as.numeric(format(time_period, '%Y')))
    +dataDiagnostics(taxa, site, as.numeric(format(time_period, '%Y')))
     # }
    + + + diff --git a/docs/reference/date2timeperiod.html b/docs/reference/date2timeperiod.html index d083122..15ba117 100644 --- a/docs/reference/date2timeperiod.html +++ b/docs/reference/date2timeperiod.html @@ -1,32 +1,48 @@ - + -Assign dates to time periods — date2timeperiod • sparta +Assign dates to time periods — date2timeperiod • sparta - + - - + + - + + + + + + - - - + + + + + + + + + + + + - + + @@ -84,20 +106,24 @@ -
    +
    +

This function assigns dates to time periods, a necessary step for methods that use time periods rather than dates, such as Frescalo and Telfer.

    +
    date2timeperiod(Date, time_periods)
    -

    Arguments

    +

    Arguments

    @@ -129,19 +155,19 @@

    Examp nSites <- 50 # set number of sites # Create somes dates -first <- as.Date(strptime("1980/01/01", "%Y/%m/%d")) -last <- as.Date(strptime(paste(1980+(nyr-1),"/12/31", sep=''), "%Y/%m/%d")) +first <- as.Date(strptime("1980/01/01", "%Y/%m/%d")) +last <- as.Date(strptime(paste(1980+(nyr-1),"/12/31", sep=''), "%Y/%m/%d")) dt <- last-first -Date <- first + (runif(nSamples)*dt) +Date <- first + (runif(nSamples)*dt) # Create time periods dataframe -time_periods <- data.frame(start=c(1980,1990),end=c(1989,1999)) +time_periods <- data.frame(start=c(1980,1990),end=c(1989,1999)) # Create time periods column using a vector tps <- date2timeperiod(Date = Date, time_periods = time_periods) # Create time periods column using a data.frame -tps <- date2timeperiod(Date = data.frame(start = Date, end = (Date+100)), +tps <- date2timeperiod(Date = data.frame(start = Date, end = (Date+100)), time_periods = time_periods) + + + diff --git a/docs/reference/detection_phenology.html b/docs/reference/detection_phenology.html new file mode 100644 index 0000000..4b79e80 --- /dev/null +++ b/docs/reference/detection_phenology.html @@ -0,0 +1,188 @@ + + + + + + + + +Diagnostics for the detectability with respect to Julian Date — detection_phenology • sparta + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + +
    + +

    Creates a plot of detectability over the season and calculates some simple statistics

    + +
    + +
    detection_phenology(model, spname = NULL, bins = 12)
    + +

    Arguments

    +

    + + + + + + + + + + + + + +
    model

    a fitted sparta model of class OccDet.

    spname

    optional name of the species (used for plotting)

    bins

    number of points to estimate across the year. Defaults to 12

    + +

    Value

    + +

Some simple summary statistics describing the phenology of detection, alongside the plot.

    + +

    Details

    + +

Takes an object of class OccDet fitted with the jul_date option

    +

Calculates the phenology of detection and produces a plot of detectability over time for the reference data type.

    + +

    References

    + +

    van Strien, A.J., Termaat, T., Groenendijk, D., Mensing, V. & Kéry, M. (2010) + Site-occupancy models may offer new opportunities for dragonfly monitoring based on daily species lists. + Basic and Applied Ecology, 11, 495–503.
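No example is shown on this page, so here is a minimal sketch of the call using only the signature above; the object name `results` and the species label are illustrative, and `results` is assumed to be an occDet object fitted with the jul_date option:

```r
# Sketch only: `results` is assumed to be an occDet object
# fitted with the jul_date model type
phen <- detection_phenology(model = results,
                            spname = "a",  # optional, used for plotting
                            bins = 52)     # estimate roughly weekly
```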

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown 1.3.0.

    +
    +
    +
    + + + + + + diff --git a/docs/reference/formatOccData.html b/docs/reference/formatOccData.html index 867e398..db31475 100644 --- a/docs/reference/formatOccData.html +++ b/docs/reference/formatOccData.html @@ -1,32 +1,49 @@ - + -Format data for Occupancy detection models — formatOccData • sparta +Format data for Occupancy detection models — formatOccData • sparta - + - - + + - + + + + + + - - - + + + + + + + + + + + + - + + @@ -84,21 +107,26 @@ -
    +
    +

This takes occurrence data in the form of a vector of taxa names, locations -and time_period (usually a date) and converts them into the form needed for +and survey (usually a date) and converts them into the form needed for occupancy models (see value section)

    +
    -
    formatOccData(taxa, site, time_period, includeJDay = FALSE)
    +
    formatOccData(taxa, site, survey, replicate = NULL,
    +  closure_period = NULL, includeJDay = FALSE)
    -

    Arguments

    +

    Arguments

    @@ -110,14 +138,23 @@

    Ar

    - - + + + + + + + + + + +occDetData object.

    A character vector of site names, as long as the number of observations.

    time_period

    A numeric vector of user defined time periods, or a date vector, -as long as the number of observations.

    survey

A vector as long as the number of observations. +This must be a Date if either closure_period is not supplied or includeJDay = TRUE

    replicate

An optional vector to identify replicate samples (visits) per survey. Need not be globally unique (e.g. can be 1, 2, ..., n within surveys)

    closure_period

An optional vector of integers specifying the closure period. +If not supplied then closure_period will be extracted as the year from the survey.

    includeJDay

    Logical. If TRUE a Julian day column is returned in the -occDetData object

    @@ -128,14 +165,17 @@

    Value

    the following columns. Values in taxa columns are either TRUE or FALSE depending on whether they were observed on that visit. The second element ('occDetData') is a dataframe giving the site, list length (the number of - species observed on a visit) and year for each visit. Optionally this also includes - a Julian Day column

    + species observed on a visit) and year (or time period) for each visit. Optionally this also includes + a Julian Day column, centered on 1 July.

    References

    Isaac, N.J.B., van Strien, A.J., August, T.A., de Zeeuw, M.P. and Roy, D.B. (2014). Statistics for citizen science: extracting signals of change from noisy ecological data. - Methods in Ecology and Evolution, 5 (10), 1052-1060.

    + Methods in Ecology and Evolution, 5 (10), 1052-1060.

    +

    van Strien, A.J., Termaat, T., Groenendijk, D., Mensing, V. & Kéry, M. (2010). + Site-occupancy models may offer new opportunities for dragonfly monitoring based on daily species lists. + Basic and Applied Ecology, 11, 495-503.

    Examples

    @@ -143,29 +183,61 @@

    Examp # Create data n <- 15000 #size of dataset nyr <- 20 # number of years in data -nSamples <- 100 # set number of dates +nSurveys <- 100 # set number of dates nSites <- 50 # set number of sites # Create somes dates -first <- as.Date(strptime("2010/01/01", "%Y/%m/%d")) -last <- as.Date(strptime(paste(2010+(nyr-1),"/12/31", sep=''), "%Y/%m/%d")) +first <- as.Date(strptime("2010/01/01", "%Y/%m/%d")) +last <- as.Date(strptime(paste(2010+(nyr-1),"/12/31", sep=''), "%Y/%m/%d")) dt <- last-first -rDates <- first + (runif(nSamples)*dt) +rDates <- first + (runif(nSurveys)*dt) # taxa are set as random letters -taxa <- sample(letters, size = n, TRUE) +taxa <- sample(letters, size = n, TRUE) # three sites are visited randomly -site <- sample(paste('A', 1:nSites, sep=''), size = n, TRUE) +site <- sample(paste('A', 1:nSites, sep=''), size = n, TRUE) # the date of visit is selected at random from those created earlier -time_period <- sample(rDates, size = n, TRUE) +survey <- sample(rDates, size = n, TRUE) # run the model with these data for one species formatted_data <- formatOccData(taxa = taxa, site = site, - time_period = time_period) -# } + survey = survey, + includeJDay = TRUE) +# }# NOT RUN { +# Create data with coarser survey information +n <- 1500 #number of species observation in dataset +np <- 10 # number of closure periods in data +nSurveys <- 100 # set number of surveys +nSites <- 20 # set number of sites + +# taxa are set as random letters +taxa <- sample(letters, size = n, TRUE) + +# three sites are visited randomly +site <- sample(paste('A', 1:nSites, sep=''), size = n, TRUE) + +# the date of visit is selected at random from those created earlier +survey <- sample(nSurveys, size = n, TRUE) + +# allocate the surveys randomly to closure periods +cp <- sample(1:np, nSurveys, TRUE) +closure_period <- cp[survey] + +# run the model with these data for one species +formatted_data <- formatOccData(taxa = taxa, + site = site, + survey = survey, + closure_period = closure_period) + +# format the unicorns data +formatted_data <- formatOccData(taxa = unicorns$CONCEPT, + survey = unicorns$Date, + site = unicorns$kmsq) +# }
    +

    + + + diff --git a/docs/reference/frescalo.html b/docs/reference/frescalo.html index 94718b1..e315121 100644 --- a/docs/reference/frescalo.html +++ b/docs/reference/frescalo.html @@ -1,32 +1,51 @@ - + -Frescalo trend analysis — frescalo • sparta +Frescalo trend analysis — frescalo • sparta - + - - + + - + + + + + + - - - + + + + + + + + + + + + - + +
    @@ -84,12 +109,15 @@ -
    +
    +

    A function for using Frescalo (Hill, 2011), a tool for analysing occurrence data when recording effort is not known. This function returns the output from Frescalo to the @@ -97,6 +125,7 @@

    Frescalo trend analysis

    plot_fres to TRUE maps of the results will also be saved. Plotting the returned object gives a useful summary.

    +
    frescalo(Data, frespath, time_periods, site_col, sp_col, year_col = NULL,
       start_col = NULL, end_col = NULL, species_to_include = NULL,
    @@ -105,7 +134,7 @@ 

    Frescalo trend analysis

    alpha = 0.27, trend_option = "arithmetic", NYears = 10, ignore.ireland = F, ignore.channelislands = F)
    -

    Arguments

    +

    Arguments

    @@ -234,43 +263,43 @@

    Value

    $trend

    This dataframe provides the list of time factors for each species

    rll - - Species Name of species - - Time Time period, specified as a class (e.g. 1970); times need not be numeric and are indexed as character strings - - TFactor Time factor, the estimated relative frequency of species at the time - - St_Dev Standard deviation of the time factor, given that spt (defined below) is a weighted sum of binomial variates - - X Number of occurrences of species at the time period - - Xspt Number of occurrences, given reduced weight of locations having very low sampling effort - - Xest Estimated number of occurrences; this should be equal to spt if the algorithm has converged - - N>0.00 Number of locations with non-zero probability of the species occurring - - N>0.98 Number of locations for which the probability of occurrence was estimated as greater than 0.98 + - Species Name of species
    + - Time Time period, specified as a class (e.g. 1970); times need not be numeric and are indexed as character strings
    + - TFactor Time factor, the estimated relative frequency of species at the time
    + - St_Dev Standard deviation of the time factor, given that spt (defined below) is a weighted sum of binomial variates
    + - X Number of occurrences of species at the time period
    + - Xspt Number of occurrences, given reduced weight of locations having very low sampling effort
    + - Xest Estimated number of occurrences; this should be equal to spt if the algorithm has converged
    + - N>0.00 Number of locations with non-zero probability of the species occurring
    + - N>0.98 Number of locations for which the probability of occurrence was estimated as greater than 0.98
    $stat

    Location report

    rll - - Location Name of location; in this case locations are hectads of the GB National Grid - - Loc_no Numbering (added) of locations in alphanumeric order - - No_spp Number of species at that location; the actual number which may be zero - - Phi_in Initial value of phi, the frequency-weighted mean frequency - - Alpha Sampling effort multiplier (to achieve standard value of phi) - - Wgt_n2 effective number N2 for the neighbourhood weights; this is small if there are few floristically similar hectads close to the target hectad. It is (sum weights)^2 / (sum weights^2) - - Phi_out Value of phi after rescaling; constant, if the algorithm has converged - - Spnum_in Sum of neighbourhood frequencies before rescaling - - Spnum_out Estimated species richness, i.e. sum of neighbourhood frequencies after rescaling - - Iter Number of iterations for algorithm to converge + - Location Name of location; in this case locations are hectads of the GB National Grid
    + - Loc_no Numbering (added) of locations in alphanumeric order
    + - No_spp Number of species at that location; the actual number which may be zero
    + - Phi_in Initial value of phi, the frequency-weighted mean frequency
    + - Alpha Sampling effort multiplier (to achieve standard value of phi)
    + - Wgt_n2 effective number N2 for the neighbourhood weights; this is small if there are few floristically similar hectads close to the target hectad. It is (sum weights)^2 / (sum weights^2)
    + - Phi_out Value of phi after rescaling; constant, if the algorithm has converged
    + - Spnum_in Sum of neighbourhood frequencies before rescaling
    + - Spnum_out Estimated species richness, i.e. sum of neighbourhood frequencies after rescaling
    + - Iter Number of iterations for algorithm to converge
    $freq

    Listing of rescaled species frequencies

    rll - - Location Name of location - - Species Name of species - - Pres Record of species in location (1 = recorded, 0 = not recorded) - - Freq Frequency of species in neighbourhood of location - - Freq_1 Estimated probabilty of occurrence, i.e. frequency of species after rescaling - - SD_Frq1 Standard error of Freq_1, calculated on the assumption that Freq is a binomial variate with standard error sqrt(Freq*(1-Freq)/ Wgt_n2), where Wgt_n2 is as defined for samples.txt in section (b) - - Rank Rank of frequency in neighbourhood of location - - Rank_1 Rescaled rank, defined as Rank/Estimated species richness + - Location Name of location
    + - Species Name of species
    + - Pres Record of species in location (1 = recorded, 0 = not recorded)
    + - Freq Frequency of species in neighbourhood of location
+ - Freq_1 Estimated probability of occurrence, i.e. frequency of species after rescaling
    + - SD_Frq1 Standard error of Freq_1, calculated on the assumption that Freq is a binomial variate with standard error sqrt(Freq*(1-Freq)/ Wgt_n2), where Wgt_n2 is as defined for samples.txt in section (b)
    + - Rank Rank of frequency in neighbourhood of location
    + - Rank_1 Rescaled rank, defined as Rank/Estimated species richness
    $log

    This records all the output sent to the console when running frescalo

    @@ -278,29 +307,29 @@

    Value

    $lm_stats

    The results of linear modelling of TFactors

    rll - - SPECIES Name of species used internally by frescalo - - NAME Name of species as appears in raw data - - b The slope of the model - - a The intercept - - b_std_err Standard error of the slope - - b_tval t-value for a test of significance of the slope - - b_pval p-value for a test of significance of the slope - - a_std_err Standard error of the intercept - - a_tval t-value for a test of significance of the intercept - - a_pval p-value for a test of significance of the intercept - - adj_r2 Rescaled rank, defined as Rank/Estimated species richness - - r2 t-value for a test of significance of the intercept - - F_val F-value of the model - - F_num_df Degrees of freedom of the model - - F_den_df Denominator degrees of freedom from the F-statistic - - Ymin The earliest year in the dataset - - Ymax The latest year in the dataset - - change_... The percentage change dependent on the values given to trend_option and NYears. + - SPECIES Name of species used internally by frescalo
    + - NAME Name of species as appears in raw data
    + - b The slope of the model
    + - a The intercept
    + - b_std_err Standard error of the slope
    + - b_tval t-value for a test of significance of the slope
    + - b_pval p-value for a test of significance of the slope
    + - a_std_err Standard error of the intercept
    + - a_tval t-value for a test of significance of the intercept
    + - a_pval p-value for a test of significance of the intercept
+ - adj_r2 Adjusted R-squared of the model
+ - r2 R-squared of the model
    + - F_val F-value of the model
    + - F_num_df Degrees of freedom of the model
    + - F_den_df Denominator degrees of freedom from the F-statistic
    + - Ymin The earliest year in the dataset
    + - Ymax The latest year in the dataset
    + - change_... The percentage change dependent on the values given to trend_option and NYears.
    The following columns are only produced when there are only two time periods rll - - Z_VAL Z-value for the significance test of the trend - - SIG_95 A logical statement indicating if the trend is significant (TRUE) or non-significant (FALSE) + - Z_VAL Z-value for the significance test of the trend
    + - SIG_95 A logical statement indicating if the trend is significant (TRUE) or non-significant (FALSE)
    @@ -313,11 +342,11 @@

    R

    Examples

    # NOT RUN {
     # Load data
    -data(unicorns)
    +data(unicorns)
     
     # Run frescalo (data is save to the working directory as sinkdir is not given)
     fres_out <- frescalo(Data = unicorns,
    -                     time_periods = data.frame(start=c(1980,1990),end=c(1989,1999)),
    +                     time_periods = data.frame(start=c(1980,1990),end=c(1989,1999)),
                          site_col = 'hectad',
                          sp_col = 'CONCEPT',
                          start_col = 'TO_STARTDATE',
    @@ -341,15 +370,17 @@ 

    Contents

    -

    Site built with pkgdown.

    +

    Site built with pkgdown 1.3.0.

    -
    + + + diff --git a/docs/reference/getBugsData.html b/docs/reference/getBugsData.html new file mode 100644 index 0000000..994afa1 --- /dev/null +++ b/docs/reference/getBugsData.html @@ -0,0 +1,185 @@ + + + + + + + + +Modify the bugs data object depending on the type of model you are running — getBugsData • sparta + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + +
    + +

    This function is primarily for internal use within occDetFunc. It is used to +update the bugs data according to the needs of each model type.

    + +
    + +
    getBugsData(bugs_data, modeltype, verbose = FALSE, occDetData)
    + +

    Arguments

    +
    + + + + + + + + + + + + + + + + + +
    bugs_data

The bugs data object. This is a list specified in occDetFunc as +list(y = as.numeric(focal), Year = TP, Site = rownum, nyear = nTP, nsite = nrow(zst), +nvisit = nrow(occDetdata[i,])). Here focal is a binary (0/1) indicating whether the focal species is +present, Year gives the time periods or survey periods, Site gives the site identifiers, nyear is +the number of years in the data, nsite is the number of sites and nvisit is the number of visits

    modeltype

    Character, one of: intercept, centering, jul_date, catlistlength, contlistlength. +See occDetFunc for more information.

    verbose

    Logical, if true progress is reported to the console

    occDetData

    The 'raw' data used to create the bugs_data. This should have a +column 'L' for list length and a column 'JulDate' for Julian date

    + +

    Value

    + +

    An updated bugs_data object

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown 1.3.0.

    +
    +
    +
    + + + + + + diff --git a/docs/reference/getInitValues.html b/docs/reference/getInitValues.html new file mode 100644 index 0000000..4beaee9 --- /dev/null +++ b/docs/reference/getInitValues.html @@ -0,0 +1,181 @@ + + + + + + + + +Modify the init object depending on the type of model we are running — getInitValues • sparta + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + +
    + +

    This function is primarily for internal use within occDetFunc. It is used to +update the initial values object according to the needs of each model type.

    + +
    + +
    getInitValues(init, modeltype, verbose = FALSE)
    + +

    Arguments

    + + + + + + + + + + + + + + +
    init

An initial values object. As a minimum this is a list defined in occDetFunc +as list(z = z, alpha.p = rep(runif(1, -2, 2), nTP), a = rep(runif(1, -2, 2), nTP), +eta = rep(runif(1, -2, 2), bugs_data$nsite)). Here z is 1s/0s for whether the focal +species is present, alpha.p gives the initial values for detectability in each year, a gives +the initial values for the occupancy probability in each year, and eta gives the initial values +for the site random effects.

    modeltype

    Character, one of: intercept, centering, contlistlength. See occDetFunc for +more information.

    verbose

    Logical, if true progress is reported to the console

    + +

    Value

    + +

    An updated init (initial values) object

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown 1.3.0.

    +
    +
    +
    + + + + + + diff --git a/docs/reference/getModelFile.html b/docs/reference/getModelFile.html new file mode 100644 index 0000000..fd82c45 --- /dev/null +++ b/docs/reference/getModelFile.html @@ -0,0 +1,188 @@ + + + + + + + + +Create a sparta JAGS model file fitting your needs — getModelFile • sparta + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + +
    + +

This function is primarily for internal use within occDetFunc. It is used to +write a model file that fits the user's needs; the path to this file is returned.

    + +
    + +
    getModelFile(modeltype, regional_codes = NULL, region_aggs = NULL,
    +  verbose = FALSE)
    + +

    Arguments

    + + + + + + + + + + + + + + + + + + +
    modeltype

    Character, see occDetFunc for more information.

    regional_codes

A data.frame object detailing which site is associated with which region. +Each row designates a site and each column represents a region. The first column gives the +site name (as in site). Subsequent columns are named for each region, with 1 indicating that +the site is in that region and 0 that it is not. NOTE: a site should only be in one region

    region_aggs

A named list giving aggregations of regions that you want trend +estimates for. For example region_aggs = list(GB = c('england', 'scotland', 'wales')) +will produce a trend for GB (Great Britain) as well as its constituent nations. Note that +'england', 'scotland' and 'wales' must appear as names of columns in regional_codes. +More than one aggregate can be given, e.g. region_aggs = list(GB = c('england', 'scotland', +'wales'), UK = c('england', 'scotland', 'wales', 'northern_ireland')).

    verbose

    Logical, if true progress is reported to the console

    + +

    Value

    + +

    The path to the model file.

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown 1.3.0.

    +
    +
    +
    + + + + + + diff --git a/docs/reference/getObsModel.html b/docs/reference/getObsModel.html new file mode 100644 index 0000000..e6dbc93 --- /dev/null +++ b/docs/reference/getObsModel.html @@ -0,0 +1,172 @@ + + + + + + + + +Create the observation model component of a sparta JAGS model — getObsModel • sparta + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + +
    + +

This function is primarily for internal use within getModelFile. It is used to +write an observation model that fits the user's needs. The model is returned as a character.

    + +
    + +
    getObsModel(modeltype, verbose = FALSE)
    + +

    Arguments

    + + + + + + + + + + +
    modeltype

    Character, one of: jul_date, catlistlength, contlistlength. +See occDetFunc for more information.

    verbose

    Logical, if true progress is reported to the console

    + +

    Value

    + +

    A character, of JAGS model code, that describes the observation model.

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown 1.3.0.

    +
    +
    +
    + + + + + + diff --git a/docs/reference/getParameters.html b/docs/reference/getParameters.html new file mode 100644 index 0000000..afbb584 --- /dev/null +++ b/docs/reference/getParameters.html @@ -0,0 +1,176 @@ + + + + + + + + +Modify the initial values object depending on the type of model we are running — getParameters • sparta + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    +
    + + + +
    + +
    +
    + + +
    + +

    This function is primarily for internal use within occDetFunc. It is used to +return a vector of parameters to be monitored.

    + +
    + +
    getParameters(parameters, modeltype, verbose = FALSE)
    + +

    Arguments

    + + + + + + + + + + + + + + +
    parameters

    A character vector of parameters you want to monitor.

    modeltype

    Character, one of: indran, jul_date, catlistlength, contlistlength. +See occDetFunc for more information.

    verbose

    Logical, if true progress is reported to the console

    + +

    Value

    + +

A character vector of the parameters to be monitored.
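A minimal sketch; the base parameter names are illustrative ("alpha.p" is taken from the getInitValues description):

```r
# Extend a base set of monitored parameters for a Julian-date model;
# "psi.fs" is an illustrative name, "alpha.p" appears in getInitValues
params <- sparta:::getParameters(parameters = c("psi.fs", "alpha.p"),
                                 modeltype = "jul_date",
                                 verbose = TRUE)
```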

    + + +
    + +
    + +
    + + +
    +

    Site built with pkgdown 1.3.0.

    +
    +
    +
    + + + + + + diff --git a/docs/reference/gps_latlon2gr.html b/docs/reference/gps_latlon2gr.html index f3317d9..9b72926 100644 --- a/docs/reference/gps_latlon2gr.html +++ b/docs/reference/gps_latlon2gr.html @@ -1,32 +1,47 @@ - + -Covert latitude and longitude to other formats — gps_latlon2gr • sparta +Covert latitude and longitude to other formats — gps_latlon2gr • sparta - + - - + + - + + + + + + - - - + + + + + + + + + + + + - + + @@ -84,20 +105,24 @@ -
    +
    +

Converts latitude and longitude to a grid reference and/or easting and northing.

    +
    gps_latlon2gr(latitude, longitude, out_projection = "OSGB",
       return_type = "both")
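A minimal usage sketch with illustrative coordinates (central London):

```r
# Illustrative coordinates (central London); returns a grid reference
# and/or easting and northing, depending on return_type
gps_latlon2gr(latitude = 51.5074,
              longitude = -0.1278,
              out_projection = "OSGB",
              return_type = "both")
```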
    -

    Arguments

    +

    Arguments

    @@ -147,15 +172,17 @@

    Contents

    -

    Site built with pkgdown.

    +

    Site built with pkgdown 1.3.0.

    -
    + + + diff --git a/docs/reference/gr2gps_latlon.html b/docs/reference/gr2gps_latlon.html index d83070c..d9d6c0c 100644 --- a/docs/reference/gr2gps_latlon.html +++ b/docs/reference/gr2gps_latlon.html @@ -1,32 +1,47 @@ - + -Covert grid reference to latitude and longitude — gr2gps_latlon • sparta +Covert grid reference to latitude and longitude — gr2gps_latlon • sparta - + - - + + - + + + + + + - - - + + + + + + + + + + + + - + + @@ -84,20 +105,24 @@ -
    +
    +

Convert grid reference to latitude and longitude

    +
    gr2gps_latlon(gridref, precision = NULL, projection = "OSGB",
       centre = TRUE)
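A minimal usage sketch; the grid reference is an illustrative OSGB square:

```r
# Convert an illustrative OSGB grid reference to the lat/lon of its centre
gr2gps_latlon(gridref = "SP50", projection = "OSGB", centre = TRUE)
```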
    -

    Arguments

    +

    Arguments

    @@ -148,15 +173,17 @@

    Contents

    -

    Site built with pkgdown.

    +

    Site built with pkgdown 1.3.0.

    -
    + + + diff --git a/docs/reference/htmlSummary.html b/docs/reference/htmlSummary.html index 90b6112..8895908 100644 --- a/docs/reference/htmlSummary.html +++ b/docs/reference/htmlSummary.html @@ -1,32 +1,47 @@ - + -Create HTML Report — htmlSummary • sparta +Create HTML Report — htmlSummary • sparta - + - - + + - + + + + + + - - - + + + + + + + + + + + + - + + @@ -84,20 +105,24 @@ -

    Create HTML Report for an occDet object.

-htmlSummary(occDet, open = TRUE, output_dir = getwd(), output_file = NULL,
-  ...)
+htmlSummary(occDet, open = TRUE, output_dir = getwd(),
+  output_file = NULL, ...)
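For example, a sketch assuming results is an occDet object returned by occDetFunc or occDetModel (the file name is illustrative):

htmlSummary(results,
            open = FALSE,                       # do not open the report in a browser
            output_dir = tempdir(),             # write the report to a temporary folder
            output_file = "trend_report.html")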
Arguments

    @@ -142,15 +167,17 @@


diff --git a/docs/reference/index.html b/docs/reference/index.html
index 70300e7..6abc498 100644
--- a/docs/reference/index.html
+++ b/docs/reference/index.html
Function reference • sparta
-All functions
-
-createWeights
-Create frescalo weights file
-dataDiagnostics
-Data Diagnostics
-date2timeperiod
-Assign dates to time periods
-formatOccData
-Format data for Occupancy detection models
-frescalo
-Frescalo trend analysis
-gps_latlon2gr
-Convert latitude and longitude to other formats
-gr2gps_latlon
-Convert grid reference to latitude and longitude
-htmlSummary
-Create HTML Report
-occDetFunc
-Occupancy detection Function
-occDetModel
-Occupancy detection models
-occurrenceChange
-Calculate percentage change between two years using Bayesian output
-plot
-Plot occDet Objects
-plot_GIS
-Plot GIS shape files
-recsOverTime
-Histogram of records over time
-reportingRateModel
-Run Reporting Rate Models
-siteSelection
-Site selection method
-siteSelectionMinL
-List-length site selection
-siteSelectionMinTP
-Minimum time-period site selection
-sparta
-sparta Trend Analysis for Unstructured Data
-telfer
-Telfer's change index
-unicorns
-A fictional dataset of unicorn sightings
-WSS
-Well sampled sites model
+All functions
+
+WSS()
+Well sampled sites model
+createWeights()
+Create frescalo weights file
+dataDiagnostics()
+Data Diagnostics
+date2timeperiod()
+Assign dates to time periods
+formatOccData()
+Format data for Occupancy detection models
+frescalo()
+Frescalo trend analysis
+gps_latlon2gr()
+Convert latitude and longitude to other formats
+gr2gps_latlon()
+Convert grid reference to latitude and longitude
+occDetFunc()
+Occupancy detection Function
+occDetModel()
+Occupancy detection models
+plot_GIS()
+Plot GIS shape files
+recsOverTime()
+Histogram of records over time
+reportingRateModel()
+Run Reporting Rate Models
+siteSelection()
+Site selection method
+siteSelectionMinL()
+List-length site selection
+siteSelectionMinTP()
+Minimum time-period site selection
+sparta
+sparta Trend Analysis for Unstructured Data
+telfer()
+Telfer's change index
+unicorns
+A fictional dataset of unicorn sightings
+occurrenceChange()
+Calculate percentage change between two years using Bayesian output
+htmlSummary()
+Create HTML Report
+plot(<occDet>)
+Plot occDet Objects
+getBugsData()
+Modify the bugs data object depending on the type of model you are running
+getInitValues()
+Modify the init object depending on the type of model we are running
+getModelFile()
+Create a sparta JAGS model file fitting your needs
+getObsModel()
+Create the observation model component of a sparta JAGS model
+getParameters()
+Modify the initial values object depending on the type of model we are running
+detection_phenology()
+Diagnostics for the detectability with respect to Julian Date
+simOccData()
+Simulate for Occupancy detection models
diff --git a/docs/reference/occDetFunc.html b/docs/reference/occDetFunc.html
index 9f5c332..8dbb060 100644
--- a/docs/reference/occDetFunc.html
+++ b/docs/reference/occDetFunc.html
Occupancy detection Function — occDetFunc • sparta
    @@ -84,24 +105,29 @@ -

    Run occupancy detection models using the output from formatOccData

-occDetFunc(taxa_name, occDetdata, spp_vis, n_iterations = 5000, nyr = 2,
-  burnin = 1500, thinning = 3, n_chains = 3, write_results = TRUE,
-  output_dir = getwd(), modeltype = "sparta", seed = NULL,
-  model.function = NULL, regional_codes = NULL, region_aggs = NULL,
-  additional.parameters = NULL, additional.BUGS.elements = NULL,
-  additional.init.values = NULL, return_data = FALSE)
+occDetFunc(taxa_name, occDetdata, spp_vis, n_iterations = 5000,
+  nyr = 2, burnin = 1500, thinning = 3, n_chains = 3,
+  write_results = TRUE, output_dir = getwd(), modeltype = "sparta",
+  max_year = NULL, seed = NULL, model.function = NULL,
+  regional_codes = NULL, region_aggs = NULL,
+  additional.parameters = NULL, additional.BUGS.elements = NULL,
+  additional.init.values = NULL, return_data = TRUE)
Arguments

    @@ -122,7 +148,7 @@

    Ar

    - @@ -153,6 +179,11 @@

    Ar

    + + + + @@ -187,7 +218,7 @@

    Ar

    +to R2jags::jags 'data' argument

    @@ -202,36 +233,81 @@

    Ar

    Value

-A list including the model, bugs model output, the path of the model file used and information on the number of iterations, first year, last year, etc.
+A list including the model, JAGS model output, the path of the model file used and information on the number of iterations, first year, last year, etc.
+Key aspects of the model output include:

• "out$model" - The model used as provided to JAGS. Also contained is a list of fully observed variables. These are those listed in the BUGS data.
• "out$BUGSoutput$n.chains" - The number of Markov chains run in the MCMC simulations.
• "out$BUGSoutput$n.iter" - The total number of iterations per chain.
• "out$BUGSoutput$n.burnin" - The number of iterations discarded from the start as a burn-in period.
• "out$BUGSoutput$n.thin" - The thinning rate used. For example a thinning rate of 3 retains only every third iteration. This is used to reduce autocorrelation.
• "out$BUGSoutput$n.keep" - The number of iterations kept per chain. This is the total number of iterations minus the burn-in, then divided by the thinning rate.
• "out$BUGSoutput$n.sims" - The total number of iterations kept.
• "out$BUGSoutput$summary" - A summary table of the monitored parameters. The posterior distribution for each parameter is summarised with the mean, standard deviation, various credible intervals, a formal convergence metric (Rhat), and a measure of effective sample size (n.eff).
• "out$BUGSoutput$mean" - the mean values for all monitored parameters.
• "out$BUGSoutput$sd" - the standard deviation values for all monitored parameters.
• "out$BUGSoutput$median" - the median values for all monitored parameters.
• "out$parameters.to.save" - The names of all monitored parameters.
• "out$BUGSoutput$model.file" - The user provided or temporarily generated model file detailing the occupancy model.
• "out$n.iter" - The total number of iterations per chain.
• "out$DIC" - Whether the Deviance Information Criterion (DIC) is calculated.
• "out$BUGSoutput$sims.list" - A list of the posterior distributions for each monitored parameter. Use sims.array and sims.matrix if a different format of the posteriors is desired.
• "out$SPP_NAME" - The name of the study species.
• "out$min_year" - First year of data included in the occupancy model run.
• "out$max_year" - Final year of data included in the occupancy model run.
• "out$nsite" - The number of unique sites included in the occupancy model run.
• "out$nvisits" - The number of unique visits included in the occupancy model run.
• "out$species_sites" - The number of unique sites the species of interest was recorded in.
• "out$species_observations" - The number of unique records for the species of interest.
• "out$regions" - The names of the regions included in the model run.
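As a short sketch of how these components fit together (out stands for the list returned by occDetFunc):

out$BUGSoutput$summary   # posterior summaries including Rhat and n.eff
out$min_year             # first year of data in the model run
out$max_year             # final year of data in the model run
# n.keep follows from the other settings (up to rounding):
# iterations minus burn-in, divided by the thinning rate
(out$BUGSoutput$n.iter - out$BUGSoutput$n.burnin) / out$BUGSoutput$n.thin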

    Details

This function requires both the R package R2jags and the program JAGS. These are not installed by default when sparta is loaded and so should be installed by the user. More details can be found in the vignette.

modeltype is used to choose the model as well as the associated initial values, and parameters to monitor. Elements to choose from can be separated into the following components:

A. Prior type: this has 3 options, each of which was tested in Outhwaite et al (in review):
 1. sparta - This uses the same model as in Isaac et al (2014).
 2. indran - This is the adaptive stationary model.
 3. ranwalk - This is the random walk model.

B. Hyperprior type: This has 3 options, each of these are discussed in Outhwaite et al (in review):
 1. halfuniform - the original formulation in Isaac et al (2014).
 2. halfcauchy - preferred form, tested in Outhwaite et al (2018).
 3. inversegamma - alternative form presented in the literature.

C. List length specification: This has 3 options:
 1. catlistlength - list length as a categorical variable.
 2. contlistlength - list length as a continuous variable.
 3. nolistlength - no list length variable.

D. Julian date: this is an additional option for including Julian date within the detection model:
 1. jul_date.

Not all combinations are available in sparta. You will get an error if you try and use a combination that is not supported. There is usually a good reason why that combination is not a good idea. Here are the model elements available:

• "sparta" - This uses the same model as in Isaac et al (2014)
• "indran" - Here the prior for the year effect of the state model is modelled as a random effect. This allows the model to adapt to interannual variability.
• "ranwalk" - Here the prior for the year effect of the state model is modelled as a random walk. Each estimate for the year effect is dependent on that of the previous year.
• "halfcauchy" - Includes half-Cauchy hyperpriors for all random effects within the model. The half-Cauchy is a special case of the Student’s t distribution with 1 degree of freedom.
• "inversegamma" - Includes inverse-gamma hyperpriors for random effects within the model
• "catlistlength" - This specifies that list length should be considered as a categorical variable. There are 3 classes: lists of length 1, 2-3, and 4 and over. If none of the list length options are specified 'contlistlength' is used
• "contlistlength" - This specifies that list length should be considered as a continuous variable. If none of the list length options are specified 'contlistlength' is used
• "nolistlength" - This specifies that no list length should be used. If none of the list length options are specified 'contlistlength' is used
• "jul_date" - This adds Julian date to the model as a normal distribution with its mean and standard deviation as monitored parameters.
• "intercept" - No longer available. Includes an intercept term in the state and observation model. By including intercept terms, the occupancy and detection probabilities in each year are centred on an overall mean level.
• "centering" - No longer available. Includes hierarchical centering of the model parameters. Centring does not change the model explicitly but writes it in a way that allows parameter estimates to be updated simultaneously.

These options are provided as a vector of characters, e.g. modeltype = c('indran', 'halfcauchy', 'catlistlength')
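For example, taking one element from each component above (a sketch; visitData stands for the output of formatOccData as in the Examples below, and jul_date additionally requires formatOccData(..., includeJDay = TRUE)):

results <- occDetFunc(taxa_name = 'a',
                      occDetdata = visitData$occDetdata,
                      spp_vis = visitData$spp_vis,
                      modeltype = c('ranwalk', 'halfcauchy',
                                    'catlistlength', 'jul_date'))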

    References

Isaac, N.J.B., van Strien, A.J., August, T.A., de Zeeuw, M.P. and Roy, D.B. (2014). Statistics for citizen science: extracting signals of change from noisy ecological data. Methods in Ecology and Evolution, 5: 1052-1060.

Outhwaite, C.L., Chandler, R.E., Powney, G.D., Collen, B., Gregory, R.D. & Isaac, N.J.B. (2018). Prior specification in Bayesian occupancy modelling improves analysis of species occurrence data. Ecological Indicators, 93: 333-343.

    Examples

    @@ -243,22 +319,22 @@

    Examp nSites <- 50 # set number of sites # Create somes dates -first <- as.Date(strptime("2010/01/01", "%Y/%m/%d")) -last <- as.Date(strptime(paste(2010+(nyr-1),"/12/31", sep=''), "%Y/%m/%d")) +first <- as.Date(strptime("2010/01/01", "%Y/%m/%d")) +last <- as.Date(strptime(paste(2010+(nyr-1),"/12/31", sep=''), "%Y/%m/%d")) dt <- last-first -rDates <- first + (runif(nSamples)*dt) +rDates <- first + (runif(nSamples)*dt) # taxa are set as random letters -taxa <- sample(letters, size = n, TRUE) +taxa <- sample(letters, size = n, TRUE) # sites are visited randomly -site <- sample(paste('A', 1:nSites, sep=''), size = n, TRUE) +site <- sample(paste('A', 1:nSites, sep=''), size = n, TRUE) # the date of visit is selected at random from those created earlier -time_period <- sample(rDates, size = n, TRUE) +survey <- sample(rDates, size = n, TRUE) # Format the data -visitData <- formatOccData(taxa = taxa, site = site, time_period = time_period) +visitData <- formatOccData(taxa = taxa, site = site, survey = survey) # run the model with these data for one species (very small number of iterations) results <- occDetFunc(taxa_name = taxa[1], @@ -288,15 +364,17 @@


diff --git a/docs/reference/occDetModel.html b/docs/reference/occDetModel.html
index 62c53b2..a825310 100644
--- a/docs/reference/occDetModel.html
+++ b/docs/reference/occDetModel.html
Occupancy detection models — occDetModel • sparta

    Run occupancy detection models as described in Isaac et al, 2014

-occDetModel(taxa, site, time_period, species_list = unique(taxa),
-  write_results = TRUE, output_dir = getwd(), nyr = 2,
-  n_iterations = 5000, burnin = 1500, thinning = 3, n_chains = 3,
-  modeltype = "sparta", regional_codes = NULL, region_aggs = NULL,
-  model.function = NULL, seed = NULL, additional.parameters = NULL,
-  additional.BUGS.elements = NULL, additional.init.values = NULL,
-  return_data = FALSE)
+occDetModel(taxa, site, survey, species_list = unique(taxa),
+  write_results = TRUE, output_dir = getwd(), nyr = 2,
+  n_iterations = 5000, burnin = 1500, thinning = 3, n_chains = 3,
+  modeltype = "sparta", regional_codes = NULL, region_aggs = NULL,
+  model.function = NULL, max_year = NULL, seed = NULL,
+  additional.parameters = NULL, additional.BUGS.elements = NULL,
+  additional.init.values = NULL, return_data = FALSE)
Arguments

    nyr

    numeric, the minimum number of years on which a site must have records for it +

    numeric, the minimum number of periods on which a site must have records for it to be included in the models. Defaults to 2

    modeltype

    A character string or vector of strings that specifies the model to use. See details. If used then model.function is ignored.

    max_year

    numeric, final year to which analysis will be run, this can be set if it is beyond +the limit of the dataset. Defaults to final year of the dataset.

    seed
    additional.BUGS.elements

    A named list giving additioanl bugs elements passed -to R2jags::jags 'data' argument

    additional.init.values
    @@ -114,9 +139,9 @@

    Ar

    - - + + @@ -181,6 +206,11 @@

    Ar

    + + + + @@ -194,7 +224,7 @@

    Ar

    +to R2jags::jags 'data' argument

    @@ -215,30 +245,48 @@

    Details

This function requires both the R package R2jags and the program JAGS. These are not installed by default when sparta is loaded and so should be installed by the user. More details can be found in the vignette.

modeltype is used to choose the model as well as the associated initial values, and parameters to monitor. Elements to choose from can be separated into the following components:

A. Prior type: this has 3 options, each of which was tested in Outhwaite et al (in review):
 1. sparta - This uses the same model as in Isaac et al (2014).
 2. indran - This is the adaptive stationary model.
 3. ranwalk - This is the random walk model.

B. Hyperprior type: This has 3 options, each of these are discussed in Outhwaite et al (in review):
 1. halfuniform - the original formulation in Isaac et al (2014).
 2. halfcauchy - preferred form, tested in Outhwaite et al (in review).
 3. inversegamma - alternative form presented in the literature.

C. List length specification: This has 3 options:
 1. catlistlength - list length as a categorical variable.
 2. contlistlength - list length as a continuous variable.
 3. nolistlength - no list length variable.

D. Julian date: this is an additional option for including Julian date within the detection model:
 1. jul_date.

Not all combinations are available in sparta. You will get an error if you try and use a combination that is not supported. There is usually a good reason why that combination is not a good idea. Here are the model elements available:

• "sparta" - This uses the same model as in Isaac et al (2014)
• "indran" - Here the prior for the year effect of the state model is modelled as a random effect. This allows the model to adapt to interannual variability.
• "ranwalk" - Here the prior for the year effect of the state model is modelled as a random walk. Each estimate for the year effect is dependent on that of the previous year.
• "halfcauchy" - Includes half-Cauchy hyperpriors for all random effects within the model. The half-Cauchy is a special case of the Student’s t distribution with 1 degree of freedom.
• "inversegamma" - Includes inverse-gamma hyperpriors for random effects within the model
• "catlistlength" - This specifies that list length should be considered as a categorical variable. There are 3 classes: lists of length 1, 2-3, and 4 and over. If none of the list length options are specified 'contlistlength' is used
• "contlistlength" - This specifies that list length should be considered as a continuous variable. If none of the list length options are specified 'contlistlength' is used
• "nolistlength" - This specifies that no list length should be used. If none of the list length options are specified 'contlistlength' is used
• "jul_date" - This adds Julian date to the model as a normal distribution with its mean and standard deviation as monitored parameters.
• "intercept" - No longer available. Includes an intercept term in the state and observation model. By including intercept terms, the occupancy and detection probabilities in each year are centred on an overall mean level.
• "centering" - No longer available. Includes hierarchical centering of the model parameters. Centring does not change the model explicitly but writes it in a way that allows parameter estimates to be updated simultaneously.

These options are provided as a vector of characters, e.g. modeltype = c('indran', 'halfcauchy', 'catlistlength')

    References

Isaac, N.J.B., van Strien, A.J., August, T.A., de Zeeuw, M.P. and Roy, D.B. (2014). Statistics for citizen science: extracting signals of change from noisy ecological data. Methods in Ecology and Evolution, 5: 1052-1060.

Outhwaite, C.L., Chandler, R.E., Powney, G.D., Collen, B., Gregory, R.D. & Isaac, N.J.B. (2018). Prior specification in Bayesian occupancy modelling improves analysis of species occurrence data. Ecological Indicators, 93: 333-343.

    Roy, H.E., Adriaens, T., Isaac, N.J.B. et al. (2012) Invasive alien predator causes rapid declines of native European ladybirds. Diversity & Distributions, 18, 717-725.

    @@ -247,32 +295,32 @@

    R

    Examples

    # NOT RUN {
     # Create data
    -set.seed(125)
    +set.seed(125)
     n <- 15000 #size of dataset
     nyr <- 20 # number of years in data
     nSamples <- 100 # set number of dates
     nSites <- 50 # set number of sites
     
     # Create somes dates
    -first <- as.Date(strptime("1980/01/01", "%Y/%m/%d"))
    -last <- as.Date(strptime(paste(1980+(nyr-1),"/12/31", sep=''), "%Y/%m/%d"))
    +first <- as.Date(strptime("1980/01/01", "%Y/%m/%d"))
    +last <- as.Date(strptime(paste(1980+(nyr-1),"/12/31", sep=''), "%Y/%m/%d"))
     dt <- last-first
    -rDates <- first + (runif(nSamples)*dt)
    +rDates <- first + (runif(nSamples)*dt)
     
     # taxa are set as random letters
    -taxa <- sample(letters, size = n, TRUE)
    +taxa <- sample(letters, size = n, TRUE)
     
     # three sites are visited randomly
    -site <- sample(paste('A', 1:nSites, sep=''), size = n, TRUE)
    +site <- sample(paste('A', 1:nSites, sep=''), size = n, TRUE)
     
     # the date of visit is selected at random from those created earlier
    -time_period <- sample(rDates, size = n, TRUE)
    +survey <- sample(rDates, size = n, TRUE)
     
     # run the model with these data for one species
     # using defaults
     results <- occDetModel(taxa = taxa,
                            site = site,
    -                       time_period = time_period,
    +                       survey = survey,
                            species_list = 'a',
                            write_results = FALSE,
                            n_iterations = 1000,
    @@ -282,35 +330,35 @@ 

    Examp # run with a different model type results <- occDetModel(taxa = taxa, site = site, - time_period = time_period, + survey = survey, species_list = 'a', write_results = FALSE, n_iterations = 1000, burnin = 10, thinning = 2, seed = 125, - modeltype = c("indran", "intercept")) + modeltype = c("indran", "intercept")) # run with regions # Create region definitions -regions <- data.frame(site = unique(site), - region1 = c(rep(1, 20), rep(0, 30)), - region2 = c(rep(0, 20), rep(1, 15), rep(0, 15)), - region3 = c(rep(0, 20), rep(0, 15), rep(1, 15))) +regions <- data.frame(site = unique(site), + region1 = c(rep(1, 20), rep(0, 30)), + region2 = c(rep(0, 20), rep(1, 15), rep(0, 15)), + region3 = c(rep(0, 20), rep(0, 15), rep(1, 15))) results <- occDetModel(taxa = taxa, site = site, - time_period = time_period, + survey = survey, species_list = 'a', write_results = FALSE, n_iterations = 1000, burnin = 10, thinning = 2, seed = 125, - modeltype = c("indran", "intercept"), + modeltype = c("indran", "intercept"), regional_codes = regions, - region_aggs = list(agg1 = c('region1', 'region2'))) + region_aggs = list(agg1 = c('region1', 'region2'))) # }

diff --git a/docs/reference/occurrenceChange.html b/docs/reference/occurrenceChange.html
index 90a8b17..550b9c9 100644
--- a/docs/reference/occurrenceChange.html
+++ b/docs/reference/occurrenceChange.html
Calculate percentage change between two years using Bayesian output — occurrenceChange • sparta

Using the data returned from occDetModel this function models a linear trend between two years for each iteration of the models. The predicted values for the first and last year are then used to calculate the change. This is done for each of the iterations of the model. This distribution of the results is used to calculate the mean estimate and the 95% credible intervals.

-occurrenceChange(firstYear, lastYear, bayesOut)
+occurrenceChange(firstYear, lastYear, bayesOut, change = "growthrate",
+  region = NULL)
Arguments

    A character vector of site names, as long as the number of observations.

    time_period

    A numeric vector of user defined time periods, or a date vector, -as long as the number of observations.

    survey

    A vector as long as the number of observations. +This must be a Date if includeJDay = TRUE

    species_listmodel.function

    optionally a user defined BUGS model coded as a function (see ?jags, including the example there, for how this is done)

    max_year

    numeric, final year to which analysis will be run, this can be set if it is beyond +the limit of the dataset. Defaults to final year of the dataset.

    seed
    additional.BUGS.elements

    A named list giving additioanl bugs elements passed -to R2jags::jags 'data' argument

    additional.init.values
    @@ -116,13 +147,33 @@

    Ar

    + + + + + + + +
    bayesOut

    occDet object as returned from occDetModel

    change

A character string that specifies the type of change to be calculated, the default is annual growth rate. See details for options.

    region

A character string specifying the region name if change is to be determined from regional estimates of occupancy. Region names must match those in the model output.

    Value

    -

    A list giving the mean, credible intervals and raw data from the +

    A list giving the mean, median, credible intervals and raw data from the estimations.

    +

    Details


change is used to specify which change measure is to be calculated. There are four options to choose from: difference, percentdif, growthrate and lineargrowth.


    difference calculates the simple difference between the first and last year.


    percentdif calculates the percentage difference between the first and last year.


    growthrate calculates the annual growth rate across years.


    lineargrowth calculates the linear growth rate from a linear model.
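As a hedged sketch of what these measures mean, written out for posterior draws of occupancy in the first and last year (these are the usual definitions; the package's internal code may differ in detail):

change_measures <- function(first, last, nyr) {
  list(difference = last - first,                          # simple difference
       percentdif = 100 * (last - first) / first,          # percentage difference
       growthrate = 100 * ((last / first)^(1 / nyr) - 1))  # annual growth rate
}
# lineargrowth instead fits a linear model through the annual estimates
# for each iteration and uses the fitted slope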


    Examples

    # NOT RUN {
    @@ -133,25 +184,25 @@ 

    Examp nSites <- 50 # set number of sites # Create somes dates -first <- as.Date(strptime("1980/01/01", "%Y/%m/%d")) -last <- as.Date(strptime(paste(1980+(nyr-1),"/12/31", sep=''), "%Y/%m/%d")) +first <- as.Date(strptime("1980/01/01", "%Y/%m/%d")) +last <- as.Date(strptime(paste(1980+(nyr-1),"/12/31", sep=''), "%Y/%m/%d")) dt <- last-first -rDates <- first + (runif(nSamples)*dt) +rDates <- first + (runif(nSamples)*dt) # taxa are set as random letters -taxa <- sample(letters, size = n, TRUE) +taxa <- sample(letters, size = n, TRUE) # three sites are visited randomly -site <- sample(paste('A', 1:nSites, sep=''), size = n, TRUE) +site <- sample(paste('A', 1:nSites, sep=''), size = n, TRUE) # the date of visit is selected at random from those created earlier -time_period <- sample(rDates, size = n, TRUE) +survey <- sample(rDates, size = n, TRUE) # run the model with these data for one species results <- occDetModel(taxa = taxa, site = site, - time_period = time_period, - species_list = c('a','m','g'), + survey = survey, + species_list = c('a','m','g'), write_results = FALSE, n_iterations = 1000, burnin = 10, @@ -169,6 +220,8 @@


diff --git a/docs/reference/plot.occDet.html b/docs/reference/plot.occDet.html
index 63d4f15..100a6a9 100644
--- a/docs/reference/plot.occDet.html
+++ b/docs/reference/plot.occDet.html
Plot occDet Objects — plot.occDet • sparta
    @@ -84,20 +105,25 @@ -

    Plot occDet Objects

# S3 method for occDet
-plot(x, y = NULL, main = x$SPP_NAME, reg_agg = "", ...)
+plot(x, y = NULL, main = x$SPP_NAME, reg_agg = "",
+  ...)
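For example, a sketch assuming results is an occDet object and that a regional aggregate named 'agg1' was fitted (as in the occDetModel examples):

plot(results)                    # occupancy estimates for the whole model
plot(results, reg_agg = 'agg1')  # estimates for one regional aggregate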

Arguments

    @@ -136,15 +162,17 @@


diff --git a/docs/reference/plot_GIS.html b/docs/reference/plot_GIS.html
index 90ddad6..eeb9e4c 100644
--- a/docs/reference/plot_GIS.html
+++ b/docs/reference/plot_GIS.html
Plot GIS shape files — plot_GIS • sparta

    This function can be used to plot gis data that has been loaded from shape files using the readShapePoly() function contained in the 'maptools' R package.

 plot_GIS(gis_data = NULL, main = "", xlab = "", ylab = "",
   xlim = NULL, ylim = NULL, show.axis = TRUE, show.grid = TRUE,
-  grid.div = 1, round.grid = FALSE, grid.col = "grey", fill.col = NA,
-  line.col = NULL, bg.col = "white", box.col = NA, new.window = TRUE,
-  no.margin = FALSE, set.margin = TRUE, max.dimen = 13, cex.main = 1.2,
-  cex.lab = 1, cex.axis = 0.8, blank.plot = FALSE, plot.shape = TRUE,
-  additions = FALSE, return.dimen = TRUE)
+  grid.div = 1, round.grid = FALSE, grid.col = "grey",
+  fill.col = NA, line.col = NULL, bg.col = "white", box.col = NA,
+  new.window = TRUE, no.margin = FALSE, set.margin = TRUE,
+  max.dimen = 13, cex.main = 1.2, cex.lab = 1, cex.axis = 0.8,
+  blank.plot = FALSE, plot.shape = TRUE, additions = FALSE,
+  return.dimen = TRUE)
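For example, a sketch where the shapefile path is illustrative and the object read by maptools::readShapePoly() is a SpatialPolygonsDataFrame:

library(maptools)
UK_shape <- readShapePoly('UK_outline.shp')  # illustrative file name
plot_GIS(gis_data = UK_shape,
         main = 'UK',
         new.window = FALSE)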

Arguments

    @@ -274,15 +301,17 @@


diff --git a/docs/reference/recsOverTime.html b/docs/reference/recsOverTime.html
index 6a9cfa9..ed98592 100644
--- a/docs/reference/recsOverTime.html
+++ b/docs/reference/recsOverTime.html
Histogram of records over time — recsOverTime • sparta

    This is a useful function for visualising your data and how the number of records change over time. This is key for understanding biases that may be present in your data (Isaac et al, 2014).

 recsOverTime(time_period, Log = FALSE, col = "black", xlab = "Year",
-  ylab = ifelse(Log, "log(Frequency)", "Frequency"), ...)
+  ylab = ifelse(Log, "log(Frequency)", "Frequency"), ...)

Arguments

    @@ -113,19 +140,19 @@

    Ar

    - + - + - + - +
col
Passed to barplot, the colour of bars
xlab
Passed to barplot, the x-axis label
ylab
Passed to barplot, the y-axis label
...
other arguments to pass to barplot

    @@ -143,30 +170,30 @@

    Examp nSites <- 15 # set number of sites # Create somes dates -first <- as.POSIXct(strptime("2010/01/01", "%Y/%m/%d")) -last <- as.POSIXct(strptime(paste(2010+(nyr-1),"/12/31", sep=''), "%Y/%m/%d")) +first <- as.POSIXct(strptime("2010/01/01", "%Y/%m/%d")) +last <- as.POSIXct(strptime(paste(2010+(nyr-1),"/12/31", sep=''), "%Y/%m/%d")) dt <- last-first -rDates <- first + (runif(nSamples)*dt) +rDates <- first + (runif(nSamples)*dt) # taxa are set as random letters -taxa <- sample(letters, size = n, TRUE) +taxa <- sample(letters, size = n, TRUE) # three sites are visited randomly -site <- sample(paste('A', 1:nSites, sep=''), size = n, TRUE) +site <- sample(paste('A', 1:nSites, sep=''), size = n, TRUE) # the date of visit is selected at random from those created earlier -time_period <- sample(rDates, size = n, TRUE) +time_period <- sample(rDates, size = n, TRUE) # combine this to a dataframe (adding a final row of 'bad' data) -df <- data.frame(taxa = c(taxa,'bad'), - site = c(site,'A1'), - time_period = c(time_period, as.POSIXct(strptime("1200/01/01", "%Y/%m/%d")))) +df <- data.frame(taxa = c(taxa,'bad'), + site = c(site,'A1'), + time_period = c(time_period, as.POSIXct(strptime("1200/01/01", "%Y/%m/%d")))) # This reveals the 'bad data' recsOverTime(df$time_period) # remove and replot -df <- df[format(df$time_period, '%Y') > 2000, ] +df <- df[format(df$time_period, '%Y') > 2000, ] recsOverTime(df$time_period) # plot with style @@ -190,15 +217,17 @@


diff --git a/docs/reference/reportingRateModel.html b/docs/reference/reportingRateModel.html
index 2b829ca..3161733 100644
--- a/docs/reference/reportingRateModel.html
+++ b/docs/reference/reportingRateModel.html
Run Reporting Rate Models — reportingRateModel • sparta
    @@ -84,22 +105,26 @@ -

    Run reporting rate models to assess the change in species occurrence over time.

 reportingRateModel(taxa, site, time_period, list_length = FALSE,
-  site_effect = FALSE, species_to_include = unique(taxa),
+  site_effect = FALSE, species_to_include = unique(taxa),
   overdispersion = FALSE, verbose = FALSE, family = "Binomial",
   print_progress = FALSE)
Arguments

    @@ -162,7 +187,7 @@

    Value

number_observations gives the number of visits where the species of interest was observed. If any of the models encountered an error this will be given in the column error_message. If models do encounter errors the values for most columns will be NA.

    The data.frame has a number of attributes:

    • intercept_year - The year used for the intercept (i.e. the year whose value is set to 0). Setting the intercept to the median year helps @@ -191,28 +216,28 @@

      Examp nSites <- 15 # set number of sites # Create somes dates -first <- as.POSIXct(strptime("2010/01/01", "%Y/%m/%d")) -last <- as.POSIXct(strptime(paste(2010+(nyr-1),"/12/31", sep=''), "%Y/%m/%d")) +first <- as.POSIXct(strptime("2010/01/01", "%Y/%m/%d")) +last <- as.POSIXct(strptime(paste(2010+(nyr-1),"/12/31", sep=''), "%Y/%m/%d")) dt <- last-first -rDates <- first + (runif(nSamples)*dt) +rDates <- first + (runif(nSamples)*dt) # taxa are set as random letters -taxa <- sample(letters, size = n, TRUE) +taxa <- sample(letters, size = n, TRUE) # three sites are visited randomly -site <- sample(paste('A', 1:nSites, sep=''), size = n, TRUE) +site <- sample(paste('A', 1:nSites, sep=''), size = n, TRUE) # the date of visit is selected at random from those created earlier -time_period <- sample(rDates, size = n, TRUE) +time_period <- sample(rDates, size = n, TRUE) # combine this to a dataframe (adding a final row of 'bad' data) -df <- data.frame(taxa = c(taxa,'bad'), - site = c(site,'A1'), - time_period = c(time_period, as.POSIXct(strptime("1200/01/01", "%Y/%m/%d")))) +df <- data.frame(taxa = c(taxa,'bad'), + site = c(site,'A1'), + time_period = c(time_period, as.POSIXct(strptime("1200/01/01", "%Y/%m/%d")))) # Run the model RR_out <- reportingRateModel(df$taxa, df$site, df$time_period, print_progress = TRUE) -head(RR_out) +head(RR_out) # } @@ -233,15 +258,17 @@


diff --git a/docs/reference/simOccData.html b/docs/reference/simOccData.html
new file mode 100644
index 0000000..bf27fb4
--- /dev/null
+++ b/docs/reference/simOccData.html
Simulate for Occupancy detection models — simOccData • sparta

Simulates some data suitable for use in sparta. The user defines the parameters for the data generation. At present it works with just one species and generates the list length probabilistically.

simOccData(nsites = 20, nvisits = 100, nTP = 10, psi = 0.5,
  trend = -0.01, mu.lp = -1, tau.lp = 10, beta1 = 0.1,
  beta2 = -0.002, dtype2.p = 3, dtype3.p = 10)

      Value


A list, the first two elements of which ('spp_vis' & 'occDetData') mimic the output of occDetFunc. The third element ('Z') is the presence-absence state variable and the fourth ('p') is the true probability of detection.


      Examples

      +
      # NOT RUN {
      +# set the sparta options
      +sparta_options <- c('ranwalk', # prior on occupancy is set by last year's posterior
      +                   'jul_date', # use the Julian date as a covariate on the detection probability
      +                   'catlistlength', # categorises the visits into three sets of 'qualities'
      +                   'halfcauchy') # prior on the precisions
      +
      +# simulate some data
+mydata <- simOccData(nvisits=200, nsites=10, nTP=5, psi=0.5, beta1=0.1, beta2=-2e-3)
      +with(mydata, plot(occDetdata$Jul_date, p))
      +
+# run the occupancy model
+out <- occDetFunc('mysp', mydata$occDetdata, mydata$spp_vis, n_iterations = 1e4,
+                 modeltype = sparta_options, return_data = TRUE)
      +
      +out$BUGSoutput
      +detection_phenology(out)
      +
      +qplot(data=melt(out$BUGSoutput$sims.array), geom='line',
      +     x=Var1, col=factor(Var2), y=value) +
      + facet_wrap(~Var3, ncol=4, scales='free')
      +
      +# }
diff --git a/docs/reference/siteSelection.html b/docs/reference/siteSelection.html
index 4c4fc3a..c090971 100644
--- a/docs/reference/siteSelection.html
+++ b/docs/reference/siteSelection.html
Site selection method — siteSelection • sparta

      This function uses the method outlined in Roy et al (2012) and Isaac et al (2014) for selecting well-sampled sites from a dataset using list length and number of years as selection criteria.

      siteSelection(taxa, site, time_period, minL, minTP, LFirst = TRUE)
Arguments

    @@ -146,31 +172,31 @@

    Examp nSamples <- 20 # set number of dates # Create somes dates -first <- as.POSIXct(strptime("2003/01/01", "%Y/%m/%d")) -last <- as.POSIXct(strptime(paste(2003+(nyr-1),"/12/31", sep=''), "%Y/%m/%d")) +first <- as.POSIXct(strptime("2003/01/01", "%Y/%m/%d")) +last <- as.POSIXct(strptime(paste(2003+(nyr-1),"/12/31", sep=''), "%Y/%m/%d")) dt <- last-first -rDates <- first + (runif(nSamples)*dt) +rDates <- first + (runif(nSamples)*dt) # taxa are set as random letters -taxa <- sample(letters, size = n, TRUE) +taxa <- sample(letters, size = n, TRUE) # three sites are visited randomly -site <- sample(c('one', 'two', 'three'), size = n, TRUE) +site <- sample(c('one', 'two', 'three'), size = n, TRUE) # the date of visit is selected at random from those created earlier -time_period <- sample(rDates, size = n, TRUE) +time_period <- sample(rDates, size = n, TRUE) # combine this to a dataframe -df <- data.frame(taxa, site, time_period) -head(df)
    #> taxa site time_period -#> 1 w three 2006-03-25 19:46:36 -#> 2 o three 2006-03-25 19:46:36 -#> 3 g one 2008-04-12 22:49:43 -#> 4 x three 2009-09-06 16:34:04 -#> 5 w two 2007-01-01 01:25:29 -#> 6 g two 2009-09-06 16:34:04
    +df <- data.frame(taxa, site, time_period) +head(df)
    #> taxa site time_period +#> 1 f one 2004-10-14 08:25:06 +#> 2 d three 2004-10-16 02:35:43 +#> 3 p two 2008-01-15 15:39:28 +#> 4 p one 2003-05-10 08:34:09 +#> 5 d three 2007-06-20 08:15:56 +#> 6 j one 2004-11-28 21:52:20
    # Use the site selection function on this simulated data -dfSEL <- siteSelection(df$taxa, df$site, df$time_period, minL = 4, minTP = 3)
    #> Warning: 8 out of 150 observations will be removed as duplicates
    +dfSEL <- siteSelection(df$taxa, df$site, df$time_period, minL = 4, minTP = 3)
    #> Warning: 9 out of 150 observations will be removed as duplicates
diff --git a/docs/reference/siteSelectionMinL.html b/docs/reference/siteSelectionMinL.html
index 620f58f..21868b9 100644
--- a/docs/reference/siteSelectionMinL.html
+++ b/docs/reference/siteSelectionMinL.html
List-length site selection — siteSelectionMinL • sparta

    This function uses part of the method outlined in Roy et al (2012) and Isaac et al (2014) for selecting well-sampled sites from a dataset using list length only. siteSelection is a wrapper for this function that performs the complete site selection process as outlined in these papers.

    siteSelectionMinL(taxa, site, time_period, minL)
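For example (df is assumed to be a data.frame with taxa, site and time_period columns, as built in the siteSelection example):

dfMinL <- siteSelectionMinL(df$taxa, df$site, df$time_period, minL = 4)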
Arguments

    @@ -147,15 +174,17 @@


diff --git a/docs/reference/siteSelectionMinTP.html b/docs/reference/siteSelectionMinTP.html
index 2ed5778..4d5f57b 100644
--- a/docs/reference/siteSelectionMinTP.html
+++ b/docs/reference/siteSelectionMinTP.html
Minimum time-period site selection — siteSelectionMinTP • sparta

    This function uses part of the method outlined in Roy et al (2012) and Isaac et al (2014) for selecting well-sampled sites from a dataset using the number of time periods only. siteSelection is a wrapper for this function that performs the complete site selection process as outlined in these papers.

    siteSelectionMinTP(taxa, site, time_period, minTP)
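For example (df is again assumed to hold taxa, site and time_period columns, as in the siteSelection example):

dfMinTP <- siteSelectionMinTP(df$taxa, df$site, df$time_period, minTP = 3)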
Arguments

    @@ -147,15 +174,17 @@


diff --git a/docs/reference/sparta.html b/docs/reference/sparta.html
index d761dc7..0f8f58d 100644
--- a/docs/reference/sparta.html
+++ b/docs/reference/sparta.html
sparta Trend Analysis for Unstructured Data — sparta • sparta

The Sparta package includes methods used to analyse trends in unstructured occurrence datasets. Methods included in the package ... and Bayesian Occupancy models. These methods are reviewed in Isaac et al (2014), available at http://onlinelibrary.wiley.com/doi/10.1111/2041-210X.12254/abstract


diff --git a/docs/reference/telfer.html b/docs/reference/telfer.html
index 1d5fd6f..0c3606d 100644
--- a/docs/reference/telfer.html
+++ b/docs/reference/telfer.html
Telfer's change index — telfer • sparta
    @@ -84,22 +107,26 @@ -

Telfer's change index is designed to assess the relative change in range size of species between two time periods (Telfer et al 2002). This function can take multiple time periods and will complete all pairwise comparisons.

    telfer(taxa, site, time_period, minSite = 5, useIterations = TRUE,
       iterations = 10)
Arguments

    @@ -145,24 +172,24 @@

    Examp
    # Create fake data SS <- 5000 # number of observations -taxa <- sample(letters, SS, replace = TRUE) -site <- sample(paste('A', 1:20, sep = ''), SS, replace = TRUE) -time_period <- sample(1:3, SS, replace = TRUE) - -TelferResult <- telfer(taxa, site, time_period)
    #> Warning: 3496 out of 5000 observations will be removed as duplicates
    head(TelferResult)
    #> taxa Nsite_1.x Nsite_2.x Telfer_1_2 Nsite_1.y Nsite_3.x Telfer_1_3 Nsite_2.y -#> 1 a 17 20 0.4178907 17 20 0.9430450 20 -#> 2 b 19 20 0.7447182 19 18 -1.5673369 20 -#> 3 c 19 18 -1.8782816 19 19 -0.7439658 18 -#> 4 d 19 20 0.7447182 19 19 -0.7439658 20 -#> 5 e 20 20 1.1633923 20 16 -1.9417249 20 -#> 6 f 19 20 0.7447182 19 20 0.9344623 20 +taxa <- sample(letters, SS, replace = TRUE) +site <- sample(paste('A', 1:20, sep = ''), SS, replace = TRUE) +time_period <- sample(1:3, SS, replace = TRUE) + +TelferResult <- telfer(taxa, site, time_period)
    #> Warning: 3493 out of 5000 observations will be removed as duplicates
    head(TelferResult)
    #> taxa Nsite_1.x Nsite_2.x Telfer_1_2 Nsite_1.y Nsite_3.x Telfer_1_3 Nsite_2.y +#> 1 a 20 19 -0.5140155 20 19 -0.4521761 19 +#> 2 b 20 19 -0.5140155 20 19 -0.4521761 19 +#> 3 c 19 20 0.6886459 19 19 -0.7970458 20 +#> 4 d 19 20 0.6886459 19 20 1.1535228 20 +#> 5 e 19 18 -1.6642860 19 19 -0.7970458 18 +#> 6 f 18 18 -1.9636241 18 19 -1.0475778 18 #> Nsite_3.y Telfer_2_3 -#> 1 20 1.1440416 -#> 2 18 -1.0763359 -#> 3 19 -1.1171519 -#> 4 19 -0.3455839 -#> 5 16 -1.9870010 -#> 6 20 1.1440416
diff --git a/docs/reference/unicorns.html b/docs/reference/unicorns.html
index a53c945..dadf2d2 100644
--- a/docs/reference/unicorns.html
+++ b/docs/reference/unicorns.html
A fictional dataset of unicorn sightings — unicorns • sparta

    This is a fictional occurrence dataset of 70 species of unicorn in the UK.

    unicorns
    @@ -104,23 +129,23 @@

Author
Tom August, 2015-07-01

diff --git a/inst/doc/sparta_vignette.html b/inst/doc/sparta_vignette.html
new file mode 100644
index 0000000..1926542
--- /dev/null
+++ b/inst/doc/sparta_vignette.html

    sparta - Species Presence Absence R Trends Analyses


    Introduction


Sparta provides a range of tools for analysing trends in species occurrence data and is based on the work presented in Isaac et al (2014). The data used in these methods is ‘what, where and when’. The ‘what’ is typically a species name. ‘Where’ is the location of the observation, sometimes referred to as the site. This is typically a 1km, 2km or 10km grid square but could also be a non-regular location such as field sites or counties. ‘When’ is the time when an observation is made, and the requirements differ between methods. Some methods require a date while others require you to aggregate dates into time periods for comparison.


    All of the methods described here require multi species data. This is because they use information across all species to assess biases.


    In this vignette we will run through the methods and show how they can be used in reproducible examples.


    Installation


    Installing the package is easy and can be done from CRAN. Alternatively the development version can be installed from GitHub.


    NOTE: JAGS must be installed before the R package installation will work. JAGS can be found here - http://sourceforge.net/projects/mcmc-jags/files/JAGS/

    # Install the package from CRAN
    +# THIS WILL WORK ONLY AFTER THE PACKAGE IS PUBLISHED
    +install.packages('sparta')
    +
    +# Or install the development version from GitHub
    +library(devtools)
    +install_github('biologicalrecordscentre/sparta')
    +
    # Once installed, load the package
    +library(sparta)
    +
    ## Loading required package: lme4
    +## Loading required package: Matrix
    +## Loading required package: Rcpp
    +

    The functions in sparta cover a range of tasks. Primarily they are focused on analysing trends in species occurrence data while accounting for biases (see Isaac et al, 2014). In this vignette we step through these functions and others so that you can understand how the package works. If you have any questions you can find the package maintainers email address using maintainer('sparta'), and if you have issues or bugs you can report them here


    Modelling methods


    Create some example data


    Clearly when you are using sparta you will want to use your own data, however perhaps you are only at the planning stage of your project? This code shows you how to create some example data so that you can try out sparta’s functionality.

    +
    # Create data
    +n <- 8000 # size of dataset
    +nyr <- 50 # number of years in data
    +nSamples <- 200 # set number of dates
    +nSites <- 100 # set number of sites
    +set.seed(125) # set a random seed
    +
    +# Create somes dates
    +first <- as.Date(strptime("1950/01/01", "%Y/%m/%d")) 
    +last <- as.Date(strptime(paste(1950+(nyr-1),"/12/31", sep=''), "%Y/%m/%d")) 
    +dt <- last-first 
    +rDates <- first + (runif(nSamples)*dt)
    +
    +# taxa are set semi-randomly
    +taxa_probabilities <- seq(from = 0.1, to = 0.7, length.out = 26)
    +taxa <- sample(letters, size = n, TRUE, prob = taxa_probabilities)
    +
    +# sites are visited semi-randomly
    +site_probabilities <- seq(from = 0.1, to = 0.7, length.out = nSites)
    +site <- sample(paste('A', 1:nSites, sep=''), size = n, TRUE, prob = site_probabilities)
    +
    +# the date of visit is selected semi-randomly from those created earlier
    +time_probabilities <- seq(from = 0.1, to = 0.7, length.out = nSamples)
    +time_period <- sample(rDates, size = n, TRUE, prob = time_probabilities)
    +
    +myData <- data.frame(taxa, site, time_period)
    +
    +# Let's have a look at the my example data
    +head(myData)
    +
    ##   taxa site time_period
    +## 1    r  A51  1970-01-14
    +## 2    v  A87  1980-09-29
    +## 3    e  A56  1996-04-14
    +## 4    z  A28  1959-01-16
    +## 5    r  A77  1970-09-21
    +## 6    x  A48  1990-02-25
    +

    In general this is the format of data you will need for all of the functions in sparta. The taxa and site columns should be characters and the time_period column should ideally be a date but can in some cases be a numeric.

    +

    There are many sources of wildlife observation data including GBIF (Global Biodiversity Information Facility) and the NBN gateway (National Biodiversity Network). Both of these repositories have R packages that will allow you to download this type of data straight into your R session (see rgbif and rnbn for details)

    +
    +
    +

    Assessing the quality of data

    +

    It can be useful to have a look at your data before you do any analyses. For example it is important to understand the biases in your data. The function dataDiagnostics is designed to help with this.

    +
    # Run some data diagnostics on our data
    +results <- dataDiagnostics(taxa = myData$taxa,
    +                           site = myData$site,
    +                           time_period = myData$time_period,
    +                           progress_bar = FALSE)
    +
    ## Warning in errorChecks(taxa = taxa, site = site, time_period =
    +## time_period): 94 out of 8000 observations will be removed as duplicates
    +
    ## ## Linear model outputs ##
    +## 
    +## There is no detectable change in the number of records over time:
    +## 
    +##                 Estimate   Std. Error    t value  Pr(>|t|)
    +## (Intercept) -894.8997359 1710.0719088 -0.5233112 0.6031654
    +## time_period    0.5342617    0.8660553  0.6168910 0.5402219
    +## 
    +## 
    +## There is no detectable change in list lengths over time:
    +## 
    +##                 Estimate   Std. Error    z value     Pr(>|z|)
    +## (Intercept) 2.390402e-01 1.208657e-02 19.7773477 4.665954e-87
    +## time_period 1.098369e-06 2.135956e-06  0.5142282 6.070924e-01
    +

The plot produced shows the number of records for each year in the top plot and the average list length in a box plot at the bottom. List length is the number of taxa observed on a visit to a site, where a visit is taken to be a unique combination of ‘where’ and ‘when’. A trend in the number of observations across time is not uncommon, and a formal test for such a trend is performed in the form of a linear model. Trends in the number of records over time are handled by all of the methods presented in sparta in a variety of different ways. Trends in list length are tested in the same manner, and both are returned to the console. A trend in list length can cause some methods, such as the reporting rate methods, to fail (see the ‘LessEffortPerVisit’ scenario in Isaac et al (2014)). Unsurprisingly, since this is a random dataset, we have no trend in either the number of records or list length over time. This function also works if we have a numeric for time_period, such as the year.

    +
    # Run some data diagnostics on our data, now time_period
    +# is set to be a year
    +results <- dataDiagnostics(taxa = myData$taxa,
    +                           site = myData$site,
    +                           time_period = as.numeric(format(myData$time_period, '%Y')),
    +                           progress_bar = FALSE)
    +
    ## Warning in errorChecks(taxa = taxa, site = site, time_period =
    +## time_period): 419 out of 8000 observations will be removed as duplicates
    +
+ [Plot: records per year (top) and list length per visit (bottom), with time_period as year]
    +
    ## ## Linear model outputs ##
    +## 
    +## There is no detectable change in the number of records over time:
    +## 
    +##                 Estimate   Std. Error    t value  Pr(>|t|)
    +## (Intercept) -894.8997359 1710.0719088 -0.5233112 0.6031654
    +## time_period    0.5342617    0.8660553  0.6168910 0.5402219
    +## 
    +## 
    +## There is no detectable change in list lengths over time:
    +## 
    +##                  Estimate   Std. Error    z value  Pr(>|z|)
    +## (Intercept) -0.6465523185 1.5554513917 -0.4156686 0.6776525
    +## time_period  0.0007201245 0.0007874907  0.9144546 0.3604780
    +

If we want to view these results in more detail we can interrogate the object results.

    +
    # See what is in results..
    +names(results)
    +
    ## [1] "RecordsPerYear"  "VisitListLength" "modelRecs"       "modelList"
    +
    # Let's have a look at the details
    +head(results$RecordsPerYear)
    +
    ## RecordsPerYear
    +## 1950 1951 1952 1953 1954 1955 
    +##  224   69  147  181  119  218
    +
    head(results$VisitListLength)
    +
    ##   time_period site listLength
    +## 1        1950 A100          3
    +## 2        1950  A11          1
    +## 3        1950  A12          2
    +## 4        1950  A13          1
    +## 5        1950  A15          1
    +## 6        1950  A16          2
    +
    summary(results$modelRecs)
    +
    ## 
    +## Call:
    +## glm(formula = count ~ time_period, data = mData)
    +## 
    +## Deviance Residuals: 
    +##     Min       1Q   Median       3Q      Max  
    +## -136.06   -59.03   -22.40    50.51   265.99  
    +## 
    +## Coefficients:
    +##              Estimate Std. Error t value Pr(>|t|)
    +## (Intercept) -894.8997  1710.0719  -0.523    0.603
    +## time_period    0.5343     0.8661   0.617    0.540
    +## 
    +## (Dispersion parameter for gaussian family taken to be 7809.915)
    +## 
    +##     Null deviance: 377848  on 49  degrees of freedom
    +## Residual deviance: 374876  on 48  degrees of freedom
    +## AIC: 594.01
    +## 
    +## Number of Fisher Scoring iterations: 2
    +
    summary(results$modelList)
    +
    ## 
    +## Call:
    +## glm(formula = listLength ~ time_period, family = "poisson", data = space_time)
    +## 
    +## Deviance Residuals: 
    +##     Min       1Q   Median       3Q      Max  
    +## -0.9132  -0.8866  -0.1309   0.5260   3.8475  
    +## 
    +## Coefficients:
    +##               Estimate Std. Error z value Pr(>|z|)
    +## (Intercept) -0.6465523  1.5554514  -0.416    0.678
    +## time_period  0.0007201  0.0007875   0.914    0.360
    +## 
    +## (Dispersion parameter for poisson family taken to be 1)
    +## 
    +##     Null deviance: 2737.1  on 3489  degrees of freedom
    +## Residual deviance: 2736.3  on 3488  degrees of freedom
    +## AIC: 11607
    +## 
    +## Number of Fisher Scoring iterations: 5
    +
    +
    +

    Telfer

    +

Telfer’s change index is designed to assess the relative change in range size of species between two time periods (Telfer et al, 2002). This is a simple method that is robust but has low power to detect trends where they exist. While this method is designed to compare two time periods, sparta can take many time periods and will complete all pairwise comparisons.

    +

Our data is not quite in the correct format for Telfer: the method compares time periods, but our time_period column is a date. We can fix this by using the date2timeperiod function.

    +
    ## Create a new column for the time period
    +# First define my time periods
    +time_periods <- data.frame(start = c(1950, 1960, 1970, 1980, 1990),
    +                           end = c(1959, 1969, 1979, 1989, 1999))
    +
    +time_periods
    +
    ##   start  end
    +## 1  1950 1959
    +## 2  1960 1969
    +## 3  1970 1979
    +## 4  1980 1989
    +## 5  1990 1999
    +
    # Now use these to assign my dates to time periods
    +myData$tp <- date2timeperiod(myData$time_period, time_periods)
    +
    +head(myData)
    +
    ##   taxa site time_period tp
    +## 1    r  A51  1970-01-14  3
    +## 2    v  A87  1980-09-29  4
    +## 3    e  A56  1996-04-14  5
    +## 4    z  A28  1959-01-16  1
    +## 5    r  A77  1970-09-21  3
    +## 6    x  A48  1990-02-25  5
    +

As you can see, our new column indicates which time period each date falls into, with 1 being the earliest time period, 2 the second, and so on. This function will also work if, instead of a single date for each record, you have a date range.

    +
    ## Create a dataset where we have date ranges
    +Date_range <- data.frame(startdate = myData$time_period,
    +                         enddate = (myData$time_period + 600))
    +
    +head(Date_range)
    +
    ##    startdate    enddate
    +## 1 1970-01-14 1971-09-06
    +## 2 1980-09-29 1982-05-22
    +## 3 1996-04-14 1997-12-05
    +## 4 1959-01-16 1960-09-07
    +## 5 1970-09-21 1972-05-13
    +## 6 1990-02-25 1991-10-18
    +
    # Now assign my date ranges to time periods
    +Date_range$time_period <- date2timeperiod(Date_range, time_periods)
    +
    +head(Date_range)
    +
    ##    startdate    enddate time_period
    +## 1 1970-01-14 1971-09-06           3
    +## 2 1980-09-29 1982-05-22           4
    +## 3 1996-04-14 1997-12-05           5
    +## 4 1959-01-16 1960-09-07          NA
    +## 5 1970-09-21 1972-05-13           3
    +## 6 1990-02-25 1991-10-18           5
    +

As you can see in this example, when a date range spans the boundaries of your time periods, NA is returned.
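
If these NA records are of no use to you they can simply be dropped before analysis, for example:

+
# Remove records whose date range spans a time period boundary
+Date_range_clean <- Date_range[!is.na(Date_range$time_period), ]
+nrow(Date_range_clean)
+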

    +

Now we have our data in the right format we can use the telfer function to analyse it. The Telfer index for each species is the standardized residual from a linear regression across all species, and is a measure of relative change only, as the average real trend across species is obscured (Isaac et al (2014); Telfer et al, 2002). Telfer is used for comparing two time periods; if you have more than this, the telfer function will complete all pairwise comparisons.

    +
    # Here is our data
    +head(myData)
    +
    ##   taxa site time_period tp
    +## 1    r  A51  1970-01-14  3
    +## 2    v  A87  1980-09-29  4
    +## 3    e  A56  1996-04-14  5
    +## 4    z  A28  1959-01-16  1
    +## 5    r  A77  1970-09-21  3
    +## 6    x  A48  1990-02-25  5
    +
    telfer_results <- telfer(taxa = myData$taxa,
    +                         site = myData$site,
    +                         time_period = myData$tp,
    +                         minSite = 2)
    +
    ## Warning in errorChecks(taxa = taxa, site = site, time_period =
    +## time_period, : 2541 out of 8000 observations will be removed as duplicates
    +

    We get a warning message indicating that a large number of rows are being removed as duplicates. This occurs since we are now aggregating records into time periods and therefore creating a large number of duplicates.
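
You can verify this yourself by counting the duplicated taxa-site-period combinations directly:

+
# Count records that become duplicates once dates are aggregated
+# into time periods
+sum(duplicated(myData[, c('taxa', 'site', 'tp')]))
+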

    +

    The results give the change index for each species (rows) in each of the pairwise comparisons of time periods (columns).

    +
    head(telfer_results)
    +
    ##   taxa  Telfer_1_2   Telfer_1_3 Telfer_1_4 Telfer_1_5 Telfer_2_3
    +## 1    a -0.67842545 -1.744577671 -1.8073843 -0.7000801 -1.8352888
    +## 2    b -0.90368128 -0.841219630 -0.8697828 -1.5449132 -0.5139840
    +## 3    c  0.96096754 -0.008737329  0.2181534  0.3726534 -0.7254485
    +## 4    d  0.79744179 -0.558165922  0.3848417  1.6642357 -1.1759409
    +## 5    e -0.01856808  0.490523483 -1.0901348 -1.6500473  0.3450083
    +## 6    f -0.80201507 -0.412461197 -1.0846426  0.3817399  0.1657078
    +##   Telfer_2_4 Telfer_2_5 Telfer_3_4 Telfer_3_5 Telfer_4_5
    +## 1 -2.1097232 -0.4557972 -1.1728237  0.8437536  1.4880569
    +## 2 -0.6234749 -0.8326960 -0.3171487 -1.1756988 -0.8995878
    +## 3 -0.3891040 -0.3595835  0.3549603 -0.2184517 -0.3834038
    +## 4 -0.1875890  0.5294236  1.2663488  1.3562488  0.6466352
    +## 5 -1.1254544 -1.7153826 -1.8881411 -2.1972910 -1.0810351
    +## 6 -0.5122655  0.8827473 -0.8383498  0.4662370  1.3111555
    +
    +
    +

    Reporting Rate Models

    +

The reporting rate models in sparta are all either GLMs or GLMMs with year as a continuous covariate, but they are flexible, giving the user a number of options for their analyses. These options include the addition of covariates to account for biases in the data, including a random site effect and a fixed effect of list length.

    +

In Isaac et al (2014) it was shown that reporting rate models can be susceptible to type 1 errors under certain scenarios, and that with site and list length covariates the models performed better when the data were biased. These methods were found to outperform simple methods like Telfer.

    +

The common feature among these models is that the quantity under consideration is the ‘probability of being recorded’. When binomial models are used (as is the default) this is the ‘probability for an average visit’; for the Bernoulli version it is the probability of being recorded per time period.

    +
    +

    Data selection

    +

Before undertaking modelling, the data can be subset in an effort to remove records that may introduce bias. Sub-setting was found to reduce power in Isaac et al (2014) but can partially deal with uneven sampling of sites. This process can also be used with other methods and is not solely applicable to the reporting rate models.

    +

The first function allows you to subset your data by list length. This works out, for each combination of ‘where’ and ‘when’ (a visit), the number of species observed (the list length). Any records that do not come from a list that meets your list length criteria are then dropped.
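
The underlying calculation is easy to reproduce yourself, which makes the definition of a visit concrete; a sketch:

+
# A visit is a unique site + date combination; list length is the
+# number of taxa recorded on that visit
+visits <- aggregate(taxa ~ site + time_period, data = myData,
+                    FUN = function(x) length(unique(x)))
+names(visits)[3] <- 'listLength'
+head(visits)
+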

    +
    # Select only records which occur on lists of length 2 or more
    +myDataL <- siteSelectionMinL(taxa = myData$taxa,
    +                             site = myData$site,
    +                             time_period = myData$time_period,
    +                             minL = 2) 
    +
    ## Warning in errorChecks(taxa, site, time_period): 94 out of 8000
    +## observations will be removed as duplicates
    +
    head(myDataL)
    +
    ##   taxa site time_period
    +## 1    u   A1  1952-11-16
    +## 2    n   A1  1952-11-16
    +## 3    x   A1  1960-06-06
    +## 4    s   A1  1960-06-06
    +## 5    x   A1  1999-08-03
    +## 6    d   A1  1999-08-03
    +
    # We now have a much smaller dataset after subsetting
    +nrow(myData)
    +
    ## [1] 8000
    +
    nrow(myDataL)
    +
    ## [1] 3082
    +

We are also able to subset by the number of times a site is sampled, using the function siteSelectionMinTP. When time_period is a date, as in this case, minTP is the minimum number of years a site must be sampled in for it to be included in the subset.

    +
    # Select only data from sites sampled in at least 10 years
    +myDataTP <- siteSelectionMinTP(taxa = myData$taxa,
    +                               site = myData$site,
    +                               time_period = myData$time_period,
    +                               minTP = 10) 
    +
    ## Warning in errorChecks(taxa, site, time_period): 94 out of 8000
    +## observations will be removed as duplicates
    +
    head(myDataTP)
    +
    ##   taxa site time_period
    +## 1    r  A51  1970-01-14
    +## 2    v  A87  1980-09-29
    +## 3    e  A56  1996-04-14
    +## 4    z  A28  1959-01-16
    +## 5    r  A77  1970-09-21
    +## 6    x  A48  1990-02-25
    +
# Here we have only lost a small number of rows; this is because
+# many sites in our data are visited in a lot of years. Those
+# rows that have been removed are duplicates
    +nrow(myData)
    +
    ## [1] 8000
    +
    nrow(myDataTP)
    +
    ## [1] 7906
    +

As you can see in the above example, minTP specifies the number of years a site must be sampled in to be included. However, our dataset is very well sampled, so we might be interested in another measure of time. For example, you might want only sites that have been sampled in at least 60 months. Let’s see how this could be done.

    +
    # We need to create a new column to represent unique months
    +# this could also be any unit of time you wanted (week, decade, etc.)
    +
    +# This line returns a unique character for each month
    +unique_Months <- format(myData$time_period, "%B_%Y")
    +head(unique_Months)
    +
    ## [1] "January_1970"   "September_1980" "April_1996"     "January_1959"  
    +## [5] "September_1970" "February_1990"
    +
    # Week could be done like this, see ?strptime for more details
    +unique_Weeks <- format(myData$time_period, "%U_%Y")
    +head(unique_Weeks)
    +
    ## [1] "02_1970" "39_1980" "15_1996" "02_1959" "38_1970" "08_1990"
    +
# Now let's subset to records from sites sampled in 60 or more months
    +myData60Months <- siteSelectionMinTP(taxa = myData$taxa,
    +                                     site = myData$site,
    +                                     time_period = unique_Months,
    +                                     minTP = 60) 
    +
    ## Warning in errorChecks(taxa, site, time_period): 129 out of 8000
    +## observations will be removed as duplicates
    +
    head(myData60Months)
    +
    ##   taxa site    time_period
    +## 1    r  A51   January_1970
    +## 2    v  A87 September_1980
    +## 3    e  A56     April_1996
    +## 5    r  A77 September_1970
    +## 6    x  A48  February_1990
    +## 7    t  A59   January_1981
    +
    # We could merge this back with our original data if
    +# we need to retain the full dates
    +myData60Months <- merge(myData60Months, myData$time_period, 
    +                        all.x = TRUE, all.y = FALSE,
    +                        by = "row.names")
    +head(myData60Months)
    +
    ##   Row.names taxa site  time_period          y
    +## 1         1    r  A51 January_1970 1970-01-14
    +## 2        10    w  A81    June_1982 1982-06-19
    +## 3       100    v  A91 January_1996 1996-01-29
    +## 4      1000    h  A94     May_1990 1981-01-17
    +## 5      1001    m  A73   March_1999 1990-05-18
    +## 6      1002    b  A59    July_1997 1999-03-05
    +
    nrow(myData)
    +
    ## [1] 8000
    +
    nrow(myData60Months)
    +
    ## [1] 5289
    +

Following the method in Roy et al (2012) we can combine these two functions to subset both by the length of lists and by the number of years that sites are sampled. This has been wrapped up into the function siteSelection, which takes all the arguments of the previous two functions plus the argument LFirst, which indicates whether the data should be subset by list length first (TRUE) or second (FALSE).

    +
    # Subset our data as above but in one go
    +myDataSubset  <- siteSelection(taxa = myData$taxa,
    +                               site = myData$site,
    +                               time_period = myData$time_period,
    +                               minL = 2,
    +                               minTP = 10,
    +                               LFirst = TRUE)
    +
    ## Warning in errorChecks(taxa, site, time_period): 94 out of 8000
    +## observations will be removed as duplicates
    +
    head(myDataSubset)
    +
    ##    taxa site time_period
    +## 11    y A100  1950-01-04
    +## 12    k A100  1950-01-04
    +## 13    l A100  1954-01-30
    +## 14    o A100  1954-01-30
    +## 15    s A100  1954-01-30
    +## 16    m A100  1956-02-02
    +
    nrow(myDataSubset)
    +
    ## [1] 2587
    +
    +
    +

    Running Reporting Rate Models

    +

Once you have subset your data using the above functions (or perhaps not at all) the reporting rate models can be applied using the function reportingRateModel. This function offers flexibility in the model you wish to fit, allowing the user to specify whether list length and site should be used as covariates, whether over-dispersion should be used, and whether the family should be binomial or Bernoulli. A number of these variants are presented in Isaac et al (2014). While multi-species data is required, it is not necessary to model all species; in fact you can save a significant amount of time by only modelling the species you are interested in.

    +
    # Run the reporting rate model using list length as a fixed effect and 
    +# site as a random effect. Here we only model a few species.
    +system.time({
    +RR_out <- reportingRateModel(taxa = myData$taxa,
    +                             site = myData$site,
    +                             time_period = myData$time_period,
    +                             list_length = TRUE,
    +                             site_effect = TRUE,
    +                             species_to_include = c('e','u','r','o','t','a','s'),
    +                             overdispersion = FALSE,
    +                             family = 'Bernoulli',
    +                             print_progress = TRUE)
    +})
    +
    ## Warning in errorChecks(taxa = taxa, site = site, time_period =
    +## time_period, : 94 out of 8000 observations will be removed as duplicates
    +
    ## Modelling e - Species 1 of 7 
    +## Modelling u - Species 2 of 7 
    +## Modelling r - Species 3 of 7 
    +## Modelling o - Species 4 of 7 
    +## Modelling t - Species 5 of 7 
    +## Modelling a - Species 6 of 7 
    +## Modelling s - Species 7 of 7
    +
    ##    user  system elapsed 
    +##   11.44    0.00   11.46
    +
    # Let's have a look at the data that is returned
    +str(RR_out)
    +
    ## 'data.frame':    7 obs. of  14 variables:
    +##  $ species_name       : Factor w/ 7 levels "e","u","r","o",..: 1 2 3 4 5 6 7
    +##  $ intercept.estimate : num  -4.53 -3.52 -3.32 -3.63 -3.68 ...
    +##  $ year.estimate      : num  -0.005811 -0.006944 -0.003321 0.000264 -0.004033 ...
    +##  $ listlength.estimate: num  0.574 0.702 0.472 0.572 0.717 ...
    +##  $ intercept.stderror : num  0.185 0.113 0.123 0.129 0.116 ...
    +##  $ year.stderror      : num  0.00583 0.00342 0.00359 0.00384 0.00361 ...
    +##  $ listlength.stderror: num  0.1092 0.0659 0.0754 0.0759 0.0683 ...
    +##  $ intercept.zvalue   : num  -24.5 -31.3 -27.1 -28.2 -31.6 ...
    +##  $ year.zvalue        : num  -0.9961 -2.0324 -0.9244 0.0688 -1.1177 ...
    +##  $ listlength.zvalue  : num  5.25 10.65 6.26 7.54 10.49 ...
    +##  $ intercept.pvalue   : num  2.34e-132 1.58e-214 6.06e-162 1.07e-174 3.78e-219 ...
    +##  $ year.pvalue        : num  0.3192 0.0421 0.3553 0.9452 0.2637 ...
    +##  $ listlength.pvalue  : num  1.51e-07 1.68e-26 3.76e-10 4.78e-14 9.57e-26 ...
    +##  $ observations       : num  144 450 398 346 398 73 426
    +##  - attr(*, "intercept_year")= num 1974
    +##  - attr(*, "min_year")= num -24.5
    +##  - attr(*, "max_year")= num 24.5
    +##  - attr(*, "nVisits")= int 6211
    +##  - attr(*, "model_formula")= chr "taxa ~ year + listLength + (1|site)"
    +
    # We could plot these to see the species trends
    +with(RR_out,
    +     # Plot graph
    +     {plot(x = 1:7, y = year.estimate,
    +           ylim = range(c(year.estimate - year.stderror,
    +                          year.estimate + year.stderror)),
+           ylab = 'Year effect (+/- Std Err)',
    +           xlab = 'Species',
    +           xaxt = "n")
    +     # Add x-axis with species names
    +     axis(1, at = 1:7, labels = species_name)
    +     # Add the error bars
    +     arrows(1:7, year.estimate - year.stderror,
    +            1:7, year.estimate + year.stderror,
    +            length = 0.05, angle = 90, code = 3)}
    +     )
    +
+ [Plot: year effect estimates with error bars for the seven modelled species]
    +

    The returned object is a data frame with one row per species. Each column gives information on an element of the model output including covariate estimates, standard errors and p-values. This object also has some attributes giving the year that was chosen as the intercept, the number of visits in the dataset and the model formula used.
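
These attributes can be read with attr(), for example:

+
# Pull out the attributes listed in the str() output above
+attr(RR_out, 'intercept_year')
+attr(RR_out, 'model_formula')
+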

    +

These models can take a long time to run when your dataset is large or you have a large number of species to model. To make this faster it is possible to parallelise the process across species, which can significantly improve your run times. Here is an example of how we would parallelise the above example using the R package snowfall.

    +
    # Load in snowfall
    +library(snowfall)
    +
    ## Loading required package: snow
    +
    # I have 4 cpus on my PC so I set cpus to 4
    +# when I initialise the cluster
    +sfInit(parallel = TRUE, cpus = 4)
    +
    ## Warning in searchCommandline(parallel, cpus = cpus, type = type,
    +## socketHosts = socketHosts, : Unknown option on commandline:
    +## rmarkdown::render('W:/PYWELL_SHARED/Pywell Projects/BRC/Tom August/R
    +## Packages/Trend analyses/sparta/pre_vignette/sparta_vignette.Rmd', encoding
    +
    ## R Version:  R version 3.2.0 (2015-04-16)
    +
    ## snowfall 1.84-6 initialized (using snow 0.3-13): parallel execution on 4 CPUs.
    +
    # Export my data to the cluster
    +sfExport('myData')
    +
    +# I create a function that takes a species name and runs my models
    +RR_mod_function <- function(taxa_name){
    +  
    +  library(sparta)
    +  
    +  RR_out <- reportingRateModel(species_to_include = taxa_name,
    +                               taxa = myData$taxa,
    +                               site = myData$site,
    +                               time_period = myData$time_period,
    +                               list_length = TRUE,
    +                               site_effect = TRUE,
    +                               overdispersion = FALSE,
    +                               family = 'Bernoulli',
    +                               print_progress = FALSE)  
    +} 
    +
    +# I then run this in parallel
    +system.time({
    +para_out <- sfClusterApplyLB(c('e','u','r','o','t','a','s'), RR_mod_function)
    +})
    +
    ##    user  system elapsed 
    +##    0.00    0.00    7.21
    +
# Combine the per-species outputs into a single data frame
    +RR_out_combined <- do.call(rbind, para_out)
    +
    +# Stop the cluster
    +sfStop()
    +
    ## 
    +## Stopping cluster
    +
    # You'll see the output is the same as when we did it serially but the
    +# time taken is shorter. Using a cluster computer with many more than 
    +# 4 cores can greatly reduce run time.
    +str(RR_out_combined)
    +
    ## 'data.frame':    7 obs. of  14 variables:
    +##  $ species_name       : Factor w/ 7 levels "e","u","r","o",..: 1 2 3 4 5 6 7
    +##  $ intercept.estimate : num  -4.53 -3.52 -3.32 -3.63 -3.68 ...
    +##  $ year.estimate      : num  -0.005811 -0.006944 -0.003321 0.000264 -0.004033 ...
    +##  $ listlength.estimate: num  0.574 0.702 0.472 0.572 0.717 ...
    +##  $ intercept.stderror : num  0.185 0.113 0.123 0.129 0.116 ...
    +##  $ year.stderror      : num  0.00583 0.00342 0.00359 0.00384 0.00361 ...
    +##  $ listlength.stderror: num  0.1092 0.0659 0.0754 0.0759 0.0683 ...
    +##  $ intercept.zvalue   : num  -24.5 -31.3 -27.1 -28.2 -31.6 ...
    +##  $ year.zvalue        : num  -0.9961 -2.0324 -0.9244 0.0688 -1.1177 ...
    +##  $ listlength.zvalue  : num  5.25 10.65 6.26 7.54 10.49 ...
    +##  $ intercept.pvalue   : num  2.34e-132 1.58e-214 6.06e-162 1.07e-174 3.78e-219 ...
    +##  $ year.pvalue        : num  0.3192 0.0421 0.3553 0.9452 0.2637 ...
    +##  $ listlength.pvalue  : num  1.51e-07 1.68e-26 3.76e-10 4.78e-14 9.57e-26 ...
    +##  $ observations       : num  144 450 398 346 398 73 426
    +##  - attr(*, "intercept_year")= num 1974
    +##  - attr(*, "min_year")= num -24.5
    +##  - attr(*, "max_year")= num 24.5
    +##  - attr(*, "nVisits")= int 6211
    +##  - attr(*, "model_formula")= chr "taxa ~ year + listLength + (1|site)"
    +

Using these functions it is possible to recreate the ‘Well-sampled sites’ method that is presented in Roy et al (2012) and Thomas et al (2015). This is made available in the function WSS, which is a simple wrapper around siteSelection and reportingRateModel. In this variant the data is subset by list length and the number of years each site was sampled before being run in a GLMM with site as a random effect.

    +
    # Run our data through the well-sampled sites function
    +# This time we run all species
    +WSS_out <- WSS(taxa = myData$taxa,
    +               site = myData$site,
    +               time_period = myData$time_period,
    +               minL = 2,
    +               minTP = 10,
    +               print_progress = FALSE)
    +
    ## Warning in errorChecks(taxa, site, time_period): 94 out of 8000
    +## observations will be removed as duplicates
    +
    # The data is returned in the same format as from reportingRateModel
    +str(WSS_out)
    +
    ## 'data.frame':    26 obs. of  10 variables:
    +##  $ species_name      : Factor w/ 26 levels "r","v","e","z",..: 1 2 3 4 5 6 7 8 9 10 ...
    +##  $ intercept.estimate: num  -2.29 -1.85 -3.17 -1.81 -1.75 ...
    +##  $ year.estimate     : num  -0.00912 0.0012 0.00158 0.00143 -0.00247 ...
    +##  $ intercept.stderror: num  0.1021 0.0861 0.1875 0.0848 0.0829 ...
    +##  $ year.stderror     : num  0.00684 0.00574 0.00973 0.00565 0.00554 ...
    +##  $ intercept.zvalue  : num  -22.4 -21.5 -16.9 -21.3 -21.1 ...
    +##  $ year.zvalue       : num  -1.334 0.208 0.163 0.253 -0.446 ...
    +##  $ intercept.pvalue  : num  1.70e-111 1.66e-102 6.55e-64 6.87e-101 1.06e-98 ...
    +##  $ year.pvalue       : num  0.182 0.835 0.871 0.8 0.656 ...
    +##  $ observations      : num  106 157 50 163 171 148 125 155 61 104 ...
    +##  - attr(*, "intercept_year")= num 1974
    +##  - attr(*, "min_year")= num -24.5
    +##  - attr(*, "max_year")= num 24.5
    +##  - attr(*, "nVisits")= int 1155
    +##  - attr(*, "model_formula")= chr "cbind(successes, failures) ~ year + (1|site)"
    +##  - attr(*, "minL")= num 2
    +##  - attr(*, "minTP")= num 10
    +
    # We can plot these and see that we get different results to our
    +# previous analysis since this time the method includes subsetting
    +with(WSS_out[1:10,],
    +     # Plot graph
    +     {plot(x = 1:10, y = year.estimate,
    +           ylim = range(c(year.estimate - year.stderror,
    +                          year.estimate + year.stderror)),
+           ylab = 'Year effect (+/- Std Err)',
    +           xlab = 'Species',
    +           xaxt="n")
    +     # Add x-axis with species names
    +     axis(1, at=1:10, labels = species_name[1:10])
    +     # Add the error bars
    +     arrows(1:10, year.estimate - year.stderror,
    +            1:10, year.estimate + year.stderror,
    +            length=0.05, angle=90, code=3)}
    +     )
    +
+ [Plot: year effect estimates with error bars for the first ten species]
    +
    +
    +
    +

    Occupancy models

    +

Occupancy models were found by Isaac et al (2014) to be one of the best tools for analysing species occurrence data typical of citizen science projects, being both robust and powerful. This method models the occupancy process separately from the detection process, but we will not go into the details of the model here, since there is a growing literature about occupancy models and how and when they should be used. Here we focus on how the occupancy model discussed in Isaac et al (2014) is implemented in sparta.

    +

This function works in a very similar fashion to the previous functions we have discussed. The data it takes is ‘what, where, when’ as in other functions; however, here we have the option to specify which species we wish to model. This feature has been added because occupancy models are computationally intensive. The parameters of the function give you control over the number of iterations, burn-in, thinning, the number of chains and the seed, and for advanced users there is also the possibility to pass in your own BUGS script.

    +
    # Here is our data
    +str(myData)
    +
    ## 'data.frame':    8000 obs. of  4 variables:
    +##  $ taxa       : Factor w/ 26 levels "a","b","c","d",..: 18 22 5 26 18 24 20 24 17 23 ...
    +##  $ site       : Factor w/ 100 levels "A1","A10","A100",..: 48 87 53 22 76 44 56 66 92 81 ...
    +##  $ time_period: Date, format: "1970-01-14" "1980-09-29" ...
    +##  $ tp         : int  3 4 5 1 3 5 4 1 5 4 ...
    +
    # Run an occupancy model for three species
    +# Here we use very small number of iterations 
    +# to avoid a long run time
    +system.time({
    +occ_out <- occDetModel(taxa = myData$taxa,
    +                       site = myData$site,
    +                       time_period = myData$time_period,
    +                       species_list = c('a','b','c','d'),
    +                       write_results = FALSE,
    +                       n_iterations = 200,
    +                       burnin = 15,
    +                       n_chains = 3,
    +                       thinning = 3,
    +                       seed = 123)
    +})
    +
    ## Warning in errorChecks(taxa = taxa, site = site, time_period =
    +## time_period): 94 out of 8000 observations will be removed as duplicates
    +
    ## 
    +## ###
    +## Modeling a - 1 of 4 taxa
    +
    ## module glm loaded
    +
    ## Compiling model graph
    +##    Resolving undeclared variables
    +##    Allocating nodes
    +##    Graph Size: 64272
    +## 
    +## Initializing model
    +## 
    +## 
    +## ###
    +## Modeling b - 2 of 4 taxa
    +## Compiling model graph
    +##    Resolving undeclared variables
    +##    Allocating nodes
    +##    Graph Size: 64306
    +## 
    +## Initializing model
    +## 
    +## 
    +## ###
    +## Modeling c - 3 of 4 taxa
    +## Compiling model graph
    +##    Resolving undeclared variables
    +##    Allocating nodes
    +##    Graph Size: 64308
    +## 
    +## Initializing model
    +## 
    +## 
    +## ###
    +## Modeling d - 4 of 4 taxa
    +## Compiling model graph
    +##    Resolving undeclared variables
    +##    Allocating nodes
    +##    Graph Size: 64328
    +## 
    +## Initializing model
    +
    ##    user  system elapsed 
    +##   70.20    0.08   70.53
    +
    # Lets look at the results
    +## The object returned is a list with one element for each species
    +names(occ_out)
    +
    ## [1] "a" "b" "c" "d"
    +
    # Each of these is an object of class 'occDet'
    +class(occ_out$a)
    +
    ## [1] "occDet"
    +
    # Inside these elements is the information of interest
    +names(occ_out$a)
    +
    ## [1] "model"              "BUGSoutput"         "parameters.to.save"
    +## [4] "model.file"         "n.iter"             "DIC"               
    +## [7] "SPP_NAME"           "min_year"           "max_year"
    +
    # Of particular interest to many users will be the summary
    +# data in the BUGSoutput
    +head(occ_out$a$BUGSoutput$summary)
    +
    ##                   mean         sd        2.5%          25%         50%
    +## LL.p         0.2691009  0.3102804  -0.3519707   0.04895419   0.2556833
    +## deviance   649.7160858 51.1319847 544.5524433 614.12892478 666.5255290
    +## fit        281.9522389 89.1997805 190.9883913 219.13112637 247.9393552
    +## fit.new    283.9149866 90.5760094 185.6004878 217.89008520 254.7165849
    +## mean_early   0.3701971  0.1718789   0.1274588   0.25082920   0.3516627
    +## mean_late    0.4114516  0.1241131   0.1808044   0.35083037   0.4000000
    +##                    75%       97.5%     Rhat n.eff
    +## LL.p         0.4929832   0.8780177 0.993288   190
    +## deviance   688.2839296 710.7614446 1.031683    64
    +## fit        316.7603165 499.9895976 1.009975   120
    +## fit.new    319.0321333 504.5251365 1.006007   160
    +## mean_early   0.4283236   0.7837351 1.841981     5
    +## mean_late    0.4591644   0.7623305 1.295740    12
    +
    # We have included a plotting feature for objects of class
    +# occDet which provides a useful visualisation of the trend
    +# in occupancy over time
    +plot(occ_out$a)
    +
+ [Plot: modelled occupancy over time for species 'a']
    +

Here we have run a small example, but in reality these models are usually run for many thousands of iterations, making the analysis of more than a handful of species impractical. For those with access to the necessary facilities it is possible to parallelise across species. To do this we use a pair of functions that are used internally by occDetModel. These are formatOccData, which is used to format our occurrence data into the format needed by JAGS, and occDetFunc, the function which undertakes the modelling.

    +
    # First format our data
    +formattedOccData <- formatOccData(taxa = myData$taxa,
    +                                  site = myData$site,
    +                                  time_period = myData$time_period)
    +
    ## Warning in errorChecks(taxa = taxa, site = site, time_period =
    +## time_period): 94 out of 8000 observations will be removed as duplicates
    +
    # This is a list of two elements
    +names(formattedOccData)
    +
    ## [1] "spp_vis"    "occDetdata"
    +

formatOccData returns a list of length 2; the first element ‘spp_vis’ is a data.frame with the visit (a unique combination of site and time period) in the first column and a column for each taxon. Values in the taxa columns are either TRUE or FALSE depending on whether the taxon was observed on that visit.

    +
    # Lets have a look at spp_vis
    +head(formattedOccData$spp_vis[,1:5])
    +
    ##            visit     a     b     c     d
    +## 1 A1001950-01-04 FALSE FALSE FALSE FALSE
    +## 2 A1001950-11-01  TRUE FALSE FALSE FALSE
    +## 3 A1001951-08-25 FALSE FALSE FALSE FALSE
    +## 4 A1001951-11-03 FALSE FALSE FALSE FALSE
    +## 5 A1001952-02-07 FALSE FALSE FALSE FALSE
    +## 6 A1001953-02-22 FALSE FALSE FALSE FALSE
    +

The second element (‘occDetdata’) is a data frame giving the site, list length (the number of species observed on a visit) and year for each visit.

    +
    # Lets have a look at occDetData
    +head(formattedOccData$occDetdata)
    +
    ##            visit site L year
    +## 1 A1001950-01-04 A100 2 1950
    +## 3 A1001950-11-01 A100 1 1950
    +## 4 A1001951-08-25 A100 1 1951
    +## 5 A1001951-11-03 A100 1 1951
    +## 6 A1001952-02-07 A100 1 1952
    +## 7 A1001953-02-22 A100 1 1953
    +

With our data in the correct format, this can now go into the modelling function.

    +
# Use the occupancy modelling function to parallelise the process
    +# Here we are going to use the package snowfall
    +library(snowfall)
    +
    +# I have 4 cpus on my PC so I set cpus to 4
    +# when I initialise the cluster
    +sfInit(parallel = TRUE, cpus = 4)
    +
    ## Warning in searchCommandline(parallel, cpus = cpus, type = type,
    +## socketHosts = socketHosts, : Unknown option on commandline:
    +## rmarkdown::render('W:/PYWELL_SHARED/Pywell Projects/BRC/Tom August/R
    +## Packages/Trend analyses/sparta/pre_vignette/sparta_vignette.Rmd', encoding
    +
    ## snowfall 1.84-6 initialized (using snow 0.3-13): parallel execution on 4 CPUs.
    +
    # Export my data to the cluster
    +sfExport('formattedOccData')
    +
    +# I create a function that takes a species name and runs my model
    +occ_mod_function <- function(taxa_name){
    +  
    +  library(sparta)
    +  
    +  occ_out <- occDetFunc(taxa_name = taxa_name,
    +                        n_iterations = 200,
    +                        burnin = 15, 
    +                        occDetdata = formattedOccData$occDetdata,
    +                        spp_vis = formattedOccData$spp_vis,
    +                        write_results = FALSE,
    +                        seed = 123)  
    +} 
    +
    +# I then run this in parallel
    +system.time({
    +para_out <- sfClusterApplyLB(c('a','b','c','d'), occ_mod_function)
    +})
    +
    ##    user  system elapsed 
    +##    0.02    0.01   25.95
    +
    # Name each element of this output by the species
+for(i in 1:length(para_out)) names(para_out)[i] <- para_out[[i]]$SPP_NAME
    +
    +# Stop the cluster
    +sfStop()
    +
    ## 
    +## Stopping cluster
    +
    # This takes about half the time of the 
    +# serial version we ran earlier, and the resulting object 
    +# is the same (since we set the random seed to be the same
    +# in each)
    +head(para_out$a$BUGSoutput$summary)
    +
    ##                   mean         sd        2.5%          25%         50%
    +## LL.p         0.2691009  0.3102804  -0.3519707   0.04895419   0.2556833
    +## deviance   649.7160858 51.1319847 544.5524433 614.12892478 666.5255290
    +## fit        281.9522389 89.1997805 190.9883913 219.13112637 247.9393552
    +## fit.new    283.9149866 90.5760094 185.6004878 217.89008520 254.7165849
    +## mean_early   0.3708781  0.1715081   0.1307932   0.25000000   0.3466667
    +## mean_late    0.4106272  0.1219015   0.1887431   0.35000000   0.3966667
    +##                    75%       97.5%     Rhat n.eff
    +## LL.p         0.4929832   0.8780177 0.993288   190
    +## deviance   688.2839296 710.7614446 1.031683    64
    +## fit        316.7603165 499.9895976 1.009975   120
    +## fit.new    319.0321333 504.5251365 1.006007   160
    +## mean_early   0.4349904   0.7779150 1.853765     5
    +## mean_late    0.4533333   0.7547538 1.302426    11
    +
    plot(para_out$a)
    +
+ [Plot: modelled occupancy over time for species 'a' (parallel run)]
    +

    This same approach can be used on cluster computers, which can have hundreds of processors, to dramatically reduce run times.
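
On such a machine the only change needed is to request more workers when initialising snowfall; a sketch (the core count here is hypothetical and depends on your hardware or scheduler):

+
library(snowfall)
+# e.g. a 64-core node; the sfExport/sfClusterApplyLB steps are unchanged
+sfInit(parallel = TRUE, cpus = 64)
+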

    +
    +
    +

    Frescalo

    +

The frescalo method is outlined in Hill (2012) and is a means to account for both spatial and temporal bias. This method was shown by Isaac et al (2014) to be a good method for data that are aggregated into time periods, such as when comparing atlases. The frescalo method is run using a .exe; you will need to download this file from https://github.com/BiologicalRecordsCentre/frescalo. Once you have downloaded the .exe, make a note of the directory you have placed it in, as we will need that in a moment.
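
You could also fetch the executable from within R; a sketch, assuming the file sits at the path below in that repository (check the repository for the current location before running):

+
# Hypothetical location of the executable - verify before running
+exe_url <- 'https://github.com/BiologicalRecordsCentre/frescalo/raw/master/Frescalo_3a_windows.exe'
+# mode = 'wb' is required for binary files on Windows
+download.file(exe_url, destfile = 'Frescalo_3a_windows.exe', mode = 'wb')
+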

    +

    Again we will assume that your data is in a ‘what, where, when’ format similar to that we used in the previous method:

    +
    head(myData)
    +
    ##   taxa site time_period tp
    +## 1    r  A51  1970-01-14  3
    +## 2    v  A87  1980-09-29  4
    +## 3    e  A56  1996-04-14  5
    +## 4    z  A28  1959-01-16  1
    +## 5    r  A77  1970-09-21  3
    +## 6    x  A48  1990-02-25  5
    +

Frescalo’s requirements in terms of data structure and types are a little different from those we have seen in other functions. Firstly, the entire data.frame is passed in as an argument called Data, and the column names of your various elements (taxa, site, etc.) are given as other arguments. Secondly, frescalo requires that the ‘when’ component is either a column of years or two columns, one of ‘start date’ and one of ‘end date’. Our data as presented above does not fit this format, so first we must reformat it. In our situation the simplest thing to do is to add a column giving the year. Since frescalo aggregates across time periods (often decades or greater) this loss of temporal resolution is not an issue.

    +
    # Add a year column
    +myData$year <- as.numeric(format(myData$time_period, '%Y'))
    +head(myData)
    +
    ##   taxa site time_period tp year
    +## 1    r  A51  1970-01-14  3 1970
    +## 2    v  A87  1980-09-29  4 1980
    +## 3    e  A56  1996-04-14  5 1996
    +## 4    z  A28  1959-01-16  1 1959
    +## 5    r  A77  1970-09-21  3 1970
    +## 6    x  A48  1990-02-25  5 1990
    +

Now we have our data in the correct format for frescalo there is one other major component we need: a weights file. You can find out more about the weights file and what it is used for in the original paper (Hill, 2012). In short, the weights file outlines the similarity between sites in your dataset. This information is used to weight the analysis of each site accordingly. If you are undertaking this analysis in the UK at 10km square resolution there are some built-in weights files you can use. Some of these weights files use the UK land cover map instead of floristic similarity (as used in Hill (2012)). You can find out more about these in the frescalo help file.

    +

For the sake of demonstration, let us assume that you do not have a weights file for your analysis, or that you want to create your own. To create a weights file you need two things: a measure of physical distance between your sites and a measure of similarity. In the original paper this similarity measure was floristic similarity, but it could also be habitat similarity or whatever is relevant for the taxa you are studying. In this example I have a table of distances and of land cover proportions at each site.
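
Note that myDistances and myHabitatData are not created elsewhere in this vignette. Here is a hedged sketch of how similar tables could be simulated for our 100 example sites (random coordinates and random habitat proportions; this will not reproduce the exact values shown below):

+
# Simulate site coordinates (hypothetical, for illustration only)
+coords <- data.frame(site = paste('A', 1:nSites, sep = ''),
+                     east = runif(nSites, 0, 10000),
+                     north = runif(nSites, 0, 10000))
+
+# All pairwise distances in long format: columns x, y, dist
+myDistances <- merge(coords, coords, by = NULL)
+myDistances <- data.frame(x = myDistances$site.x,
+                          y = myDistances$site.y,
+                          dist = sqrt((myDistances$east.x - myDistances$east.y)^2 +
+                                      (myDistances$north.x - myDistances$north.y)^2))
+
+# Random habitat proportions that sum to 1 for each site
+props <- matrix(runif(nSites * 5), ncol = 5)
+props <- props / rowSums(props)
+myHabitatData <- data.frame(site = coords$site,
+                            grassland = props[, 1], woodland = props[, 2],
+                            heathland = props[, 3], urban = props[, 4],
+                            freshwater = props[, 5])
+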

    +
    # Here is the distance table
    +head(myDistances)
    +
    ##     x   y     dist
    +## 1 A51 A51    0.000
    +## 2 A87 A51 4074.258
    +## 3 A56 A51 6595.711
    +## 4 A28 A51 1531.943
    +## 5 A77 A51 5732.942
    +## 6 A48 A51 2394.873
    +
    # Here is our habitat data
    +head(myHabitatData)
    +
    ##   site grassland  woodland  heathland      urban freshwater
    +## 1  A51 0.1169123 0.1084992 0.28376157 0.37312774 0.11769919
    +## 2  A87 0.1781151 0.1307214 0.35258119 0.26223604 0.07634632
    +## 3  A56 0.2359391 0.1263644 0.25898930 0.13490734 0.24379991
    +## 4  A28 0.3100922 0.1373896 0.20870313 0.28659095 0.05722412
    +## 5  A77 0.2034073 0.4897063 0.05368464 0.01677132 0.23643036
    +## 6  A48 0.2397599 0.1046128 0.34250853 0.13055663 0.18256221
    +
    # With our distance and habitat tables in hand we can
    +# use the createWeights function to build our weights file
+# I have changed the defaults of dist_sub and sim_sub since
+# we have a small example dataset of only 100 sites
    +myWeights <- createWeights(distances = myDistances,
    +                           attributes = myHabitatData,
    +                           dist_sub = 20,
    +                           sim_sub = 10)
    +
    ## Creating similarity distance table...Complete
    +## Creating weights file...
    +## 0%
    +## 10%
    +## 20%
    +## 30%
    +## 40%
    +## 50%
    +## 60%
    +## 70%
    +## 80%
    +## 90%
    +## 100%
    +## Complete
    +
    head(myWeights)
    +
    ##   target neighbour weight
    +## 1    A51        A2 0.0311
    +## 2    A51       A47 0.1150
    +## 3    A51       A49 0.0012
    +## 4    A51       A51 1.0000
    +## 5    A51       A53 0.0160
    +## 6    A51       A62 0.2687
    +

The createWeights function follows the procedure outlined in Hill (2012) for creating weights, and more information can be found in the help file of the function. With our data and weights file we are now ready to proceed with frescalo. As with other functions, frescalo can take a range of additional arguments, which you can see by entering ?frescalo at the console; here we will do a minimal example.

    +
    # First we need to enter the location where we placed the .exe
    +# In my case I saved it to my documents folder
    +myFrescaloPath <- 'C:/Users/tomaug/Documents/Frescalo_3a_windows.exe'
    +
    +# I then want to set up the time periods I want to analyse
    +# Here I say I want to compare 1980-89 to 1990-99
    +myTimePeriods <- data.frame(start = c(1980, 1990), end = c(1989, 1999))
    +head(myTimePeriods)
    +
    ##   start  end
    +## 1  1980 1989
    +## 2  1990 1999
    +
    # I also need to specify where I want my results to be saved
    +# I'm going to save it in a folder in my working directory
    +myFolder <- '~/myFolder'
    +
    +# Simple run of frescalo
    +frescalo_results <- frescalo(Data = myData, 
    +                             frespath = myFrescaloPath,
    +                             time_periods = myTimePeriods,
    +                             site_col = 'site',
    +                             sp_col = 'taxa',
    +                             year = 'year',
    +                             Fres_weights = myWeights,
    +                             sinkdir = myFolder)
    +
    ## 
    +## SAVING DATA TO FRESCALO WORKING DIRECTORY
    +## ********************
    +## 
    +## 
    +## RUNNING FRESCALO
    +## ********************
    +
    ## Warning in run_fresc_file(fres_data = Data, output_dir = fresoutput,
    +## frescalo_path = frespath, : Your value of phi (0.74) is smaller than the
    +## 98.5 percentile of input phi (0.89). It is reccommended your phi be similar
    +## to this value. For more information see Hill (2011) reference in frescalo
    +## help file
    +
    ## Building Species List - Complete
    +## Outputting Species Results
    +##  Species 1 of 26 - a - 10/07/2015 14:21:05
    +##  Species 2 of 26 - b - 10/07/2015 14:21:05
    +##  Species 3 of 26 - c - 10/07/2015 14:21:05
    +##  Species 4 of 26 - d - 10/07/2015 14:21:05
    +##  Species 5 of 26 - e - 10/07/2015 14:21:05
    +##  Species 6 of 26 - f - 10/07/2015 14:21:05
    +##  Species 7 of 26 - g - 10/07/2015 14:21:05
    +##  Species 8 of 26 - h - 10/07/2015 14:21:05
    +##  Species 9 of 26 - i - 10/07/2015 14:21:05
    +##  Species 10 of 26 - j - 10/07/2015 14:21:05
    +##  Species 11 of 26 - k - 10/07/2015 14:21:05
    +##  Species 12 of 26 - l - 10/07/2015 14:21:05
    +##  Species 13 of 26 - m - 10/07/2015 14:21:05
    +##  Species 14 of 26 - n - 10/07/2015 14:21:05
    +##  Species 15 of 26 - o - 10/07/2015 14:21:05
    +##  Species 16 of 26 - p - 10/07/2015 14:21:05
    +##  Species 17 of 26 - q - 10/07/2015 14:21:05
    +##  Species 18 of 26 - r - 10/07/2015 14:21:05
    +##  Species 19 of 26 - s - 10/07/2015 14:21:05
    +##  Species 20 of 26 - t - 10/07/2015 14:21:05
    +##  Species 21 of 26 - u - 10/07/2015 14:21:05
    +##  Species 22 of 26 - v - 10/07/2015 14:21:05
    +##  Species 23 of 26 - w - 10/07/2015 14:21:05
    +##  Species 24 of 26 - x - 10/07/2015 14:21:05
    +##  Species 25 of 26 - y - 10/07/2015 14:21:05
    +##  Species 26 of 26 - z - 10/07/2015 14:21:05
    +## [1] "frescalo complete"
    +

We get a warning from this analysis that our value of phi is too low. In this case this is because our simulated data suggests every species is found on every site in our time periods. This is a little unrealistic, but should you get a similar warning with your data you might want to consult Hill (2012) and change your input value of phi.
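
If you do need to change it, phi can be passed straight into the frescalo function; a sketch, assuming the phi argument described in the frescalo help file:

+
# Re-run with phi raised towards the 98.5 percentile reported above
+frescalo_results_phi <- frescalo(Data = myData,
+                                 frespath = myFrescaloPath,
+                                 time_periods = myTimePeriods,
+                                 site_col = 'site',
+                                 sp_col = 'taxa',
+                                 year = 'year',
+                                 Fres_weights = myWeights,
+                                 sinkdir = myFolder,
+                                 phi = 0.89)
+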

    +

The object that is returned (frescalo_results in my case) is an object of class frescalo. This means there are a couple of special methods we can use with it.

    +
    # Using 'summary' gives a quick overview of our data
    +# This can be useful to double check that your data was read in correctly
    +summary(frescalo_results)
    +
    ##  Actual numbers in data 
    +##      Number of samples           100 
    +##      Number of species            26 
    +##      Number of time periods        2 
    +##      Number of observations     2239 
    +##      Neighbourhood weights      1000 
    +##      Benchmark exclusions          0 
    +##      Filter locations included     0
    +
    # Using 'print' we get a preview of the results
    +print(frescalo_results)
    +
    ## 
    +## Preview of $paths - file paths to frescalo log, stats, freq, trend .csv files:
    +## 
    +## [1] "~/myFolder/frescalo_150710/Output/Log.txt"           
    +## [2] "~/myFolder/frescalo_150710/Output/Stats.csv"         
    +## [3] "~/myFolder/frescalo_150710/Output/Freq.csv"          
    +## [4] "~/myFolder/frescalo_150710/Output/Trend.csv"         
    +## [5] "~/myFolder/frescalo_150710/Output/Freq_quickload.txt"
    +## 
    +## 
    +## Preview of $trend - trends file, giving the tfactor value for each species at each time period:
    +## 
    +##   Species   Time TFactor StDev  X Xspt Xest N.0.00 N.0.98
    +## 1       a 1984.5   0.544 0.201  8    8    8     92      0
    +## 2       a 1994.5   1.143 0.302 17   17   17     92      0
    +## 3       j 1984.5   1.372 0.237 46   45   45    100      0
    +## 4       j 1994.5   0.702 0.133 35   35   35    100      1
    +## 5       k 1984.5   0.961 0.167 44   43   43    100      0
    +## 6       k 1994.5   0.816 0.144 46   45   45    100      5
    +## 
    +## 
    +## Preview of $stat - statistics for each hectad in the analysis:
    +## 
    +##   Location Loc_no No_spp Phi_in Alpha Wgt_n2 Phi_out Spnum_in Spnum_out
    +## 1       A1      1     11  0.815  0.66   1.58    0.74     11.5       9.8
    +## 2      A10      2     13  0.717  1.14   3.77    0.74     14.9      15.7
    +## 3     A100      3     18  0.828  0.58   3.01    0.74     18.1      14.9
    +## 4      A11      4     16  0.847  0.49   1.81    0.74     16.9      13.3
    +## 5      A12      5     11  0.718  1.15   3.22    0.74     14.5      15.3
    +## 6      A13      6      8  0.681  1.32   3.88    0.74     14.5      16.4
    +##   Iter
    +## 1   15
    +## 2    5
    +## 3    8
    +## 4    9
    +## 5    3
    +## 6    8
    +## 
    +## 
    +## Preview of $freq - rescaled frequencies for each location and species:
    +## 
    +##   Location Species Pres   Freq  Freq1 SDFrq1 Rank Rank1
    +## 1       A1       v    1 0.9778 0.9177 0.1372    1 0.102
    +## 2       A1       k    1 0.9722 0.9046 0.1494    2 0.204
    +## 3       A1       w    1 0.9634 0.8856 0.1659    3 0.305
    +## 4       A1       y    1 0.9563 0.8715 0.1776    4 0.407
    +## 5       A1       x    1 0.9412 0.8440 0.1992    5 0.509
    +## 6       A1       e    1 0.8965 0.7740 0.2491    6 0.611
    +## 
    +## 
    +## Preview of $log - log file:
    +## 
    +##     Number of species            26
    +##     Number of time periods        2
    +##     Number of observations     2239
    +##     Neighbourhood weights      1000
    +##     Benchmark exclusions          0
    +##     Filter locations included     0
    +## 
    +## 
    +##  98.5 percentile of input phi  0.89
    +##  Target value of phi           0.74
    +## 
    +## 
    +## 
    +## 
    +## Preview of $lm_stats - trends in tfactor over time:
    +## 
    +##    SPECIES NAME       b          a b_std_err b_tval b_pval a_std_err
    +## 1       S1    a  0.0599 -118.32755        NA     NA     NA        NA
    +## 12      S2    b  0.0021   -3.27645        NA     NA     NA        NA
    +## 20      S3    c  0.0045   -8.04525        NA     NA     NA        NA
    +## 21      S4    d  0.0365  -71.60625        NA     NA     NA        NA
    +## 22      S5    e -0.0046    9.96270        NA     NA     NA        NA
    +## 23      S6    f  0.0326  -63.82470        NA     NA     NA        NA
    +##    a_tval a_pval adj_r2 r2 F_val F_num_df F_den_df   Ymin   Ymax
    +## 1      NA     NA     NA  1    NA        1        0 1984.5 1994.5
    +## 12     NA     NA     NA  1    NA        1        0 1984.5 1994.5
    +## 20     NA     NA     NA  1    NA        1        0 1984.5 1994.5
    +## 21     NA     NA     NA  1    NA        1        0 1984.5 1994.5
    +## 22     NA     NA     NA  1    NA        1        0 1984.5 1994.5
    +## 23     NA     NA     NA  1    NA        1        0 1984.5 1994.5
    +##          Z_VAL SIG_95
    +## 1   1.65116558  FALSE
    +## 12  0.06358704  FALSE
    +## 20  0.14903522  FALSE
    +## 21  1.17443512  FALSE
    +## 22 -0.17104272  FALSE
    +## 23  1.12652478  FALSE
    +
    # There is a lot of information here and you can read more about
    +# what these data mean by looking at the frescalo help file
    +# The files detailed in paths are also in the object returned
    +frescalo_results$paths
    +
    ## [1] "~/myFolder/frescalo_150710/Output/Log.txt"           
    +## [2] "~/myFolder/frescalo_150710/Output/Stats.csv"         
    +## [3] "~/myFolder/frescalo_150710/Output/Freq.csv"          
    +## [4] "~/myFolder/frescalo_150710/Output/Trend.csv"         
    +## [5] "~/myFolder/frescalo_150710/Output/Freq_quickload.txt"
    +
    names(frescalo_results)
    +
    ## [1] "paths"    "trend"    "stat"     "freq"     "log"      "lm_stats"
    +
    # However we additionally get some model results in our returned object
    +# under '$lm_stats'
    +

The results from frescalo may seem complex at first and I suggest reading the Value section of the frescalo help file for details. In brief: frescalo_results$paths lists the file paths of the raw data files for $log, $stat, $freq and $trend, in that order. frescalo_results$trend is a data.frame providing the list of time factors (a measure of probability of occurrence relative to benchmark species) for each species-timeperiod. frescalo_results$stat is a data.frame giving details about sites, such as estimated species richness. frescalo_results$freq is a data.frame of the species frequencies, that is, the probabilities that a species was present at a certain location. frescalo_results$log is a simple report of the console output from the .exe. frescalo_results$lm_stats is a data.frame giving the results of a linear regression of TFactors for each species when more than two time periods are used. If only two time periods are used (as in our example) the linear modelling section of this data.frame is filled with NAs and a z-test is performed instead (results are given in the last columns).

    +
# Let's look at some results for the first three species
    +frescalo_results$lm_stats[1:3, c('NAME','Z_VAL','SIG_95')]
    +
    ##    NAME      Z_VAL SIG_95
    +## 1     a 1.65116558  FALSE
    +## 12    b 0.06358704  FALSE
    +## 20    c 0.14903522  FALSE
    +
    # None of these have a significant change using a z-test
    +# Lets look at the raw data
    +frescalo_results$trend[frescalo_results$trend$Species %in% c('a', 'b', 'c'),
    +                       c('Species', 'Time', 'TFactor', 'StDev')]
    +
    ##    Species   Time TFactor StDev
    +## 1        a 1984.5   0.544 0.201
    +## 2        a 1994.5   1.143 0.302
    +## 23       b 1984.5   0.891 0.237
    +## 24       b 1994.5   0.912 0.230
    +## 39       c 1984.5   0.885 0.215
    +## 40       c 1994.5   0.930 0.212
    +
# We can see from these results that the big standard deviations on
+# the TFactor values mean there is no real difference between the
+# two time periods
    +

If your data are from the UK and sites are given as grid references then there is functionality to plot a simple output of your results.

    +
    # This only works with UK grid references
    +# We can load an example dataset from the UK
    +data(unicorns)
    +head(unicorns)
    +
    ##          TO_STARTDATE                Date hectad   kmsq    CONCEPT
    +## 1 1968-08-06 00:00:00 1968-08-06 00:00:00   NZ28   <NA> Species 18
    +## 2 1951-05-12 00:00:00 1951-05-12 00:00:00   SO34   <NA> Species 11
    +## 3 1946-05-06 00:00:00 1946-05-06 00:00:00   SO34 SO3443 Species 11
    +## 4 1980-05-01 00:00:00 1980-05-31 00:00:00   SO34   <NA> Species 11
    +## 5 1829-12-31 23:58:45 1830-12-30 23:58:45   SH48   <NA> Species 11
    +## 6 1981-06-21 00:00:00 1981-06-21 00:00:00   SO37   <NA> Species 11
    +
+# Now run frescalo using the built-in weights file
    +unicorn_results <- frescalo(Data = unicorns, 
    +                            frespath = myFrescaloPath,
    +                            time_periods = myTimePeriods,
    +                            site_col = 'hectad',
    +                            sp_col = 'CONCEPT',
    +                            start_col = 'TO_STARTDATE',
    +                            end_col = 'Date',
    +                            sinkdir = myFolder)
    +
+## Warning in weights_data_matchup(weights_sites = unpacked$site_names,
    +## data_sites = unique(Data$site)): 1 sites appear your data but are not in
    +## your weights file, these will be ignored
    +
+## Warning in frescalo(Data = unicorns, frespath = myFrescaloPath,
    +## time_periods = myTimePeriods, : sinkdir already contains frescalo output.
    +## New data saved in ~/myFolder/frescalo_150710(2)
    +
+##
    +## SAVING DATA TO FRESCALO WORKING DIRECTORY
    +## ********************
    +## 
    +## 
    +## RUNNING FRESCALO
    +## ********************
    +## 
    +## Building Species List - Complete
    +## Outputting Species Results
    +##  Species 1 of 55 - Species 1 - 10/07/2015 15:40:46
    +##  Species 2 of 55 - Species 10 - 10/07/2015 15:40:46
    +##  Species 3 of 55 - Species 11 - 10/07/2015 15:40:46
    +##  Species 4 of 55 - Species 12 - 10/07/2015 15:40:46
    +##  Species 5 of 55 - Species 13 - 10/07/2015 15:40:46
    +##  Species 6 of 55 - Species 14 - 10/07/2015 15:40:46
    +##  Species 7 of 55 - Species 15 - 10/07/2015 15:40:46
    +##  Species 8 of 55 - Species 16 - 10/07/2015 15:40:46
    +##  Species 9 of 55 - Species 17 - 10/07/2015 15:40:46
    +##  Species 10 of 55 - Species 18 - 10/07/2015 15:40:46
    +##  Species 11 of 55 - Species 19 - 10/07/2015 15:40:46
    +##  Species 12 of 55 - Species 2 - 10/07/2015 15:40:46
    +##  Species 13 of 55 - Species 20 - 10/07/2015 15:40:46
    +##  Species 14 of 55 - Species 21 - 10/07/2015 15:40:46
    +##  Species 15 of 55 - Species 22 - 10/07/2015 15:40:46
    +##  Species 16 of 55 - Species 23 - 10/07/2015 15:40:46
    +##  Species 17 of 55 - Species 24 - 10/07/2015 15:40:46
    +##  Species 18 of 55 - Species 25 - 10/07/2015 15:40:46
    +##  Species 19 of 55 - Species 27 - 10/07/2015 15:40:46
    +##  Species 20 of 55 - Species 28 - 10/07/2015 15:40:46
    +##  Species 21 of 55 - Species 29 - 10/07/2015 15:40:46
    +##  Species 22 of 55 - Species 3 - 10/07/2015 15:40:46
    +##  Species 23 of 55 - Species 30 - 10/07/2015 15:40:46
    +##  Species 24 of 55 - Species 31 - 10/07/2015 15:40:46
    +##  Species 25 of 55 - Species 32 - 10/07/2015 15:40:46
    +##  Species 26 of 55 - Species 33 - 10/07/2015 15:40:46
    +##  Species 27 of 55 - Species 34 - 10/07/2015 15:40:46
    +##  Species 28 of 55 - Species 35 - 10/07/2015 15:40:46
    +##  Species 29 of 55 - Species 36 - 10/07/2015 15:40:46
    +##  Species 30 of 55 - Species 37 - 10/07/2015 15:40:46
    +##  Species 31 of 55 - Species 38 - 10/07/2015 15:40:46
    +##  Species 32 of 55 - Species 39 - 10/07/2015 15:40:46
    +##  Species 33 of 55 - Species 4 - 10/07/2015 15:40:46
    +##  Species 34 of 55 - Species 40 - 10/07/2015 15:40:46
    +##  Species 35 of 55 - Species 41 - 10/07/2015 15:40:46
    +##  Species 36 of 55 - Species 42 - 10/07/2015 15:40:46
    +##  Species 37 of 55 - Species 44 - 10/07/2015 15:40:46
    +##  Species 38 of 55 - Species 45 - 10/07/2015 15:40:46
    +##  Species 39 of 55 - Species 46 - 10/07/2015 15:40:46
    +##  Species 40 of 55 - Species 47 - 10/07/2015 15:40:46
    +##  Species 41 of 55 - Species 48 - 10/07/2015 15:40:46
    +##  Species 42 of 55 - Species 49 - 10/07/2015 15:40:46
    +##  Species 43 of 55 - Species 5 - 10/07/2015 15:40:46
    +##  Species 44 of 55 - Species 50 - 10/07/2015 15:40:46
    +##  Species 45 of 55 - Species 51 - 10/07/2015 15:40:46
    +##  Species 46 of 55 - Species 52 - 10/07/2015 15:40:46
    +##  Species 47 of 55 - Species 54 - 10/07/2015 15:40:46
    +##  Species 48 of 55 - Species 55 - 10/07/2015 15:40:46
    +##  Species 49 of 55 - Species 56 - 10/07/2015 15:40:46
    +##  Species 50 of 55 - Species 57 - 10/07/2015 15:40:46
    +##  Species 51 of 55 - Species 6 - 10/07/2015 15:40:46
    +##  Species 52 of 55 - Species 66 - 10/07/2015 15:40:46
    +##  Species 53 of 55 - Species 7 - 10/07/2015 15:40:46
    +##  Species 54 of 55 - Species 8 - 10/07/2015 15:40:46
    +##  Species 55 of 55 - Species 9 - 10/07/2015 15:40:46
    +## [1] "frescalo complete"
    +

It is worth noting the console output here. The first warning tells us that we have data from a site that is not in our weights file; we might want to investigate that and add the site to the weights file, but we will ignore it for now. The second warning tells us that the sinkdir we supplied already contains frescalo output; the function gets around this by saving the new results to a renamed directory. Finally, we get a long list of all the species as their results are compiled internally.
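
If you wanted to track down the offending site, a simple comparison of site names would do it. This is a hypothetical sketch: 'myWeights' is not an object created in this vignette, so substitute whichever weights data.frame you use, with target site names in its first column:

+
+# Hypothetical: compare the sites in our data against those in a weights
+# data.frame ('myWeights' stands in for your own weights object)
+setdiff(unique(unicorns$hectad), unique(myWeights[[1]]))
+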

    +

    Now for the plotting.

    +
+plot(unicorn_results)
    +
    +

Each panel of the plot gives different information about your results. The top-right panel shows the observed number of species at each site (given in unicorn_results$stat$No_spp); this can be contrasted with the top-left panel, which gives the estimated number of species after accounting for recording effort (given in unicorn_results$stat$Spnum_out). Recording effort is presented in the bottom-left panel: low values of alpha (white) show areas of high recording effort (given in unicorn_results$stat$Alpha). A summary of the species trends is given in the bottom-right panel (taken from unicorn_results$lm_stats). In this case there is a skew towards species increasing; however, some of these increases may be non-significant, which could be explored in more detail by referring to unicorn_results$lm_stats.
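
As a starting point for that exploration, you could rank species by their z-values. A minimal sketch, reusing the lm_stats columns shown earlier:

+
+# Sketch: order species by z-value to see which trends are strongest
+# and whether they pass the 95% significance threshold
+ranked <- unicorn_results$lm_stats[order(unicorn_results$lm_stats$Z_VAL,
+                                         decreasing = TRUE), ]
+head(ranked[, c('NAME', 'Z_VAL', 'SIG_95')])
+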

diff --git a/vignettes/.gitignore b/vignettes/.gitignore
new file mode 100644
index 0000000..097b241
--- /dev/null
+++ b/vignettes/.gitignore
@@ -0,0 +1,2 @@
+*.html
+*.R
diff --git a/vignettes/sparta_vignette.md b/vignettes/sparta_vignette.Rmd
similarity index 99%
rename from vignettes/sparta_vignette.md
rename to vignettes/sparta_vignette.Rmd
index e57793a..81a2df5 100644
--- a/vignettes/sparta_vignette.md
+++ b/vignettes/sparta_vignette.Rmd
@@ -1,11 +1,11 @@
-# sparta - Species Presence Absence R Trends Analyses
-Tom August
-June 2015
-
-
+---
+title: "sparta - Species Presence Absence R Trends Analyses"
+output: rmarkdown::html_vignette
+vignette: >
+  %\VignetteIndexEntry{Overview}
+  %\VignetteEngine{knitr::rmarkdown}
+  %\VignetteEncoding{UTF-8}
+---
 # Introduction
diff --git a/vignettes/sparta_vignette.pdf b/vignettes/sparta_vignette.pdf
deleted file mode 100644
index 7567628..0000000
Binary files a/vignettes/sparta_vignette.pdf and /dev/null differ