Select Base year for Collated index #7

Open
RetoSchmucki opened this issue Nov 14, 2020 · 3 comments

Comments

@RetoSchmucki
Owner

Here is a solution for selecting the reference year to report trends, instead of using the mean of the series.

After bootstrap sampling in Vignette 2....

of transect monitored

co_index_b <- co_index[COL_INDEX > 0.0001 & COL_INDEX < 100000, ]
co_index_logInd <- co_index_b[BOOTi == 0, .(M_YEAR, COL_INDEX)][, .(logInd = log(COL_INDEX) / log(10)), by = M_YEAR][, mean_logInd := mean(logInd)]

# set base year

base_year <- 2012
co_index_logInd$base_year_logInd <- co_index_logInd[M_YEAR == base_year, logInd]

# merge the mean log index with the full bootstrap dataset

setkey(co_index_logInd, M_YEAR); setkey(co_index_b, M_YEAR)
co_index_b <- merge(co_index_b, co_index_logInd, all.x = TRUE)

# compute the log collated index for each bootstrap sample

setkey(co_index_b, BOOTi, M_YEAR)
co_index_b[ , boot_logInd := log(COL_INDEX)/log(10)]

# compute the metrics used for the graph of the Collated Log-Index with base year (observed, bootstrap samples, confidence interval, linear trend)

b1 <- data.table(M_YEAR = co_index_b$M_YEAR, LCI = 2 + co_index_b$boot_logInd - co_index_b$base_year_logInd)
b2 <- data.table(M_YEAR = co_index_b[BOOTi == 0, M_YEAR], LCI = 2 + co_index_b[BOOTi == 0, logInd] - co_index_b[BOOTi == 0, base_year_logInd])
b5 <- b1[co_index_b$BOOTi != 0, quantile(LCI, 0.025, na.rm = TRUE), by = M_YEAR]
b6 <- b1[co_index_b$BOOTi != 0, quantile(LCI, 0.975, na.rm = TRUE), by = M_YEAR]

# output data

LCI_1 <- merge(b2, b5,  by = "M_YEAR")
LCI_out <- merge(LCI_1, b6, by = "M_YEAR")
LCI_out$SPECIES <- p_Speciess
colnames(LCI_out) <- c("YEAR", "LCI", "Q.025", "Q.975", "SPECIES")
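
The same steps can be tried on toy data without the vignette objects. This is only a sketch in base R: the values, the number of replicates and the noise level are made up for illustration, and `obs_log` and `boot_log` are hypothetical stand-ins for the `BOOTi == 0` and `BOOTi != 0` rows of `co_index_b`. As in the code above, every bootstrap replicate is shifted by the observed base-year value, not by its own base-year value.

```r
set.seed(1)

years   <- 2010:2014
# toy observed log10 collated index (stand-in for BOOTi == 0), made-up values
obs_log <- c(1.10, 1.25, 1.05, 1.30, 1.20)

# 200 toy bootstrap replicates: observed values plus random noise,
# one row per replicate, one column per year (stand-in for BOOTi != 0)
boot_log <- t(replicate(200, obs_log + rnorm(length(years), sd = 0.1)))

# set base year and take the observed log index for that year
base_year <- 2012
base_val  <- obs_log[years == base_year]

# rescale so that the observed base year equals 2 on the log10 scale
lci_obs  <- 2 + obs_log  - base_val
lci_boot <- 2 + boot_log - base_val

# percentile interval per year across the bootstrap replicates
out <- data.frame(
  YEAR  = years,
  LCI   = lci_obs,
  Q.025 = apply(lci_boot, 2, quantile, probs = 0.025),
  Q.975 = apply(lci_boot, 2, quantile, probs = 0.975)
)
out
```

Because the replicates are not re-based on their own base-year value, the interval at the base year keeps a non-zero width in this sketch.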
@SarahVray

Hi @RetoSchmucki, that's interesting, thanks! Could you let me know what would be the benefit of selecting a reference year instead of the mean of the time-series? I see what it would change on the collated indices graph but what would be the impact on the calculated trends? Thanks

@SarahVray

SarahVray commented Jan 4, 2024

Hi @RetoSchmucki, linked to this question, in the European Grassland Butterfly Indicator, the log10 species collated indices were standardized to a value of 2 for the first year and not for the time-series average, and so the indicator is set to 100 for the first year.
However, in the script from the workshop (https://butterfly-monitoring.github.io/bms_workshop/multispecies_indicators.html), if I understand correctly, the log10 species collated indices are standardized to a value of 2 for the time-series average with this line:
co_index[COL_INDEX > 0, LOGDENSITY:= log(COL_INDEX)/log(10)][, TRMOBS := LOGDENSITY - mean(LOGDENSITY) + 2, by = .(BOOTi)]
Is that correct?
But then in the following lines of the code, the indicator is re-scaled such that the smoothed indicator starts at 100 for the 1st year as in the European Grassland Butterfly Indicator, so here it was not scaled on the time-series average, am I right? Is it an issue to proceed like this? Is it OK to scale on the time-series average for the species trends and on the 1st year for the indicator trend, or would it be better to always use the 1st year as reference year in both computations? Thanks in advance for your answer.
Wishing you a happy new year!

@RetoSchmucki
Owner Author

Hi @SarahVray, sorry for not replying to your requests earlier. You are correct that the code above standardises to the time series average. For indicators, however, it is convention to set the first year to 100, so it reads as change relative to that year. Although this is the convention, it is not always the best way to assess change (e.g. if your first year is poorly informed or biased).

For the workshop code, I need to check the sequence of steps that Emily used and how the conversion was done, but converting from the log10 scale (where the reference equals 2) to the indicator scale (where the reference equals 100, since 10^2 = 100) should keep the shape of the relationship (trend). "Would it be better to use the same year of reference for both computations?" - It should not change anything if the transformation between scales is done properly, but YES, it would make the code easier to read and understand.
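
To illustrate why the choice of reference should not affect the trend, here is a minimal sketch in base R with made-up values (`logInd`, `lci_mean` and `lci_base` are hypothetical names, not objects from the workshop script). Standardising to the series mean or to a base year only shifts the log10 series by a constant, so a linear trend fitted on either version has the same slope.

```r
# toy log10 collated index for one species (made-up values, illustration only)
years  <- 2010:2019
logInd <- c(1.10, 1.25, 1.05, 1.30, 1.20, 1.00, 0.95, 1.15, 0.90, 0.85)

# standardise so that the time-series average equals 2
lci_mean <- logInd - mean(logInd) + 2

# standardise so that a chosen base year equals 2
base_year <- 2012
lci_base  <- logInd - logInd[years == base_year] + 2

# the two series differ by a constant only, so the fitted slopes match
slope_mean <- unname(coef(lm(lci_mean ~ years))[2])
slope_base <- unname(coef(lm(lci_base ~ years))[2])
all.equal(slope_mean, slope_base)

# back on the index scale the base year reads 100, because 10^2 = 100
index_base <- 10^lci_base
index_base[years == base_year]
```

Only the intercept differs between the two fits. What does change with the reference is how the graph reads, and how much a poorly informed base year affects every other value on the rescaled series.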

Choosing a reference year is not trivial, and one must consider whether that year is representative and well-informed to be used as a baseline. In many cases (BMS), the first year of activity does not provide the most accurate baseline, although it does provide the longest time series.

Happy New Year!


2 participants