Hi.
After reading #180 and #177, I thought I could take a shot at implementing cross-validation.
I defined common CV procedures, `ExpandingWindow()`, `SlidingWindow()`, and `Holdout()`, as `rolling_window` objects, plus 2 S3 methods:

- `roll()` is based on the `slider` package and applies an arbitrary function (`identity` by default) to sub-windows iteratively. It also cuts off the specified horizon. The output is a tibble with the tsibble keys plus an extra list-column containing the untransformed results.
- `cv()` fits models on folds and returns forecasts as a fable. Intermediate folds are not kept because in most cases we are only interested in the forecasts to evaluate accuracy. It's also faster and more memory-efficient.

`roll()` can be used to compute `features()` on folds, for example for time series classification.

The window parameters `.init`, `.size`, `.step`, and the cutoff `h` can be specified in terms of calendar periods, or in terms of the number of observations if `.period` is NULL. The implementation relies on the `warp` package.

I also implemented optional parallelism. I confirmed with microbenchmarks that it is more efficient to parallelize over the folds rather than over the models.
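To make the role of `.init`, `.size`, `.step`, and `h` more concrete, here is a minimal base-R sketch of how such parameters could map to train/test indices when counted in observations. It is not the code in this PR: `make_folds()` is a hypothetical helper, and the exact semantics (`.init` as the size of the first training window, `h` as the number of held-out observations per fold) are assumptions made for illustration.

```r
# Hypothetical helper, NOT part of this PR: builds train/test index pairs
# for an expanding or sliding window, counted in observations.
# Assumptions: .init = size of the first training window,
#              h     = number of test observations held out per fold.
make_folds <- function(n, .init, .step = 1L, h = 1L, .size = NULL) {
  stopifnot(.init >= 1, .step >= 1, h >= 1, .init + h <= n)
  # Last training index of each fold; stop early enough to leave h test points.
  ends <- seq(from = .init, to = n - h, by = .step)
  lapply(ends, function(end) {
    # Expanding window always starts at 1; sliding window keeps the last .size points.
    start <- if (is.null(.size)) 1L else max(1L, end - .size + 1L)
    list(train = start:end, test = (end + 1L):(end + h))
  })
}

# Expanding window: the training set grows by .step at each fold.
expanding <- make_folds(n = 10, .init = 6, .step = 2, h = 2)

# Sliding window: the training set keeps a fixed size of 4 observations.
sliding <- make_folds(n = 10, .init = 6, .step = 2, h = 2, .size = 4)
```

With these settings both calls produce two folds; they differ only in where each training window starts.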
An example of usage:
Dev version of `tibble` breaks `forecast()` and therefore `cv()`. It's caused by `[[<-.tbl_df`:

Created on 2020-03-25 by the reprex package (v0.3.0)
I did not write tests but I can work on them if you think my implementation is useful.