R package for computing the epsilon-complexity of a time series.

nateaff/ecomplex

ecomplex

R package for efficiently computing the epsilon-complexity coefficients of a time series. The coefficients estimate the complexity of the series. They are obtained by fitting a trend to the best approximation error as the series is successively downsampled and approximated.

An example

The complexity coefficients can be used as features in classification and clustering tasks. Here we generate two groups of time series from ARMA(2,2) models that differ in a single parameter (the first AR coefficient, 0.89 versus 0.69). A plot of the coefficients for each series shows that the two groups are fairly well separated in the coefficient space.

library(ecomplex)
set.seed(1)
reps   <- 100; n <- 500
group1 <- replicate(reps, arima.sim(n = n, 
                              list(ar = c(0.89, -0.49), 
                                   ma = c(-0.23, 0.25))))
group2 <- replicate(reps, arima.sim(n = n, 
                              list(ar = c(0.69, -0.49),
                                   ma = c(-0.23, 0.25))))
ecomp1 <- apply(group1, 2, ecomplex)
ecomp2 <- apply(group2, 2, ecomplex)
coeffs <- lapply(c(ecomp1, ecomp2), function(fit) c(fit$A, fit$B))

df     <- data.frame(do.call(rbind, coeffs))
df$id  <- factor(rep(c(1,2), each = reps))
names(df) <- c('A', 'B', 'id')
palette(c('gray20', 'chocolate3'))
with(df, plot(B, A, col = id, lwd = 2))

One example series from each group, plotted on the same axes.

plot(c(group1[1:500,1]), xlim = c(0,1000), 
                         ylab = '', 
                         col = 'gray20', 
                         lwd = 1.2, 
                         type ='l')
lines(501:1000, group2[, 1], col = 'chocolate3', lwd = 1.2)
abline(v = c(500), lwd = 3, col = 'gray20')

palette('default')

The basic algorithm

The ecomplex function successively downsamples and approximates a time series. The coefficients are the parameters of a log-log regression of the approximation errors on the fraction of sample points retained for each approximation. Roughly, the coefficients measure the amount of information (in terms of sample points) needed to approximate a function to within some error epsilon.
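The steps described above can be sketched in base R. This is only an illustration of the idea, not the package's implementation: the helper name `sketch_complexity`, the downsampling factors `2:6`, the use of linear interpolation, and the mean absolute error are all assumptions made for the example, and the package itself may use other approximation methods and error measures.

```r
# Illustrative sketch only; NOT the ecomplex package's implementation.
# Assumptions: linear interpolation, mean absolute error, factors 2:6.
sketch_complexity <- function(x, factors = 2:6) {
  n   <- length(x)
  idx <- seq_len(n)

  # For each factor k, keep every k-th point, rebuild the full series
  # by linear interpolation, and record the approximation error.
  errs <- vapply(factors, function(k) {
    keep    <- seq(1, n, by = k)
    rebuilt <- stats::approx(keep, x[keep], xout = idx, rule = 2)$y
    mean(abs(x - rebuilt))
  }, numeric(1))

  S   <- 1 / factors                      # fraction of points retained
  fit <- stats::lm(log(errs) ~ log(S))    # log-log regression

  list(A = unname(stats::coef(fit)[1]),   # intercept
       B = unname(stats::coef(fit)[2]),   # slope
       S = S, epsilons = errs)
}

set.seed(1)
x   <- as.numeric(arima.sim(n = 500,
                            list(ar = c(0.89, -0.49),
                                 ma = c(-0.23, 0.25))))
res <- sketch_complexity(x)
c(A = res$A, B = res$B)
```

In this sketch the error grows as fewer points are retained, so the fitted slope `B` should come out negative. The intercept and slope play the role of the `A` and `B` coefficients, though their values will not necessarily match those returned by ecomplex.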

For mathematical details see Darkhovsky and Piryatinska, Binary classification of multi-channel EEG records based on the epsilon-complexity of continuous vector functions.

Installation

# install.packages('devtools')
devtools::install_github('nateaff/ecomplex')

Future work

The package is in development, but the ecomplex interface should be fairly stable. The ecomplex function currently computes the coefficients for a univariate time series. An option might be added to compute the epsilon-complexity coefficients for a multivariate time series.

The palarm function included in the package is a change point detection algorithm. There will likely be minor changes to the return type of the function in the near future.
