A wrapper estimator that turns any scikit-learn regressor into a time series predictor or sequence-to-sequence mapper. Internally, the TimeSeriesRegressor (TSR) converts a regular dataset, whose rows are consecutive terms of a sequence, into a sequence-prediction dataset and fits the wrapped regressor on it.
Dependencies: NumPy, pandas, scikit-learn, pickle.
To build a stock market predictor that maps the previous two days of S&P 500 stock prices to the next day's price of AAPL stock, try the following:
```python
from TimeSeriesEstimator import TimeSeriesRegressor, time_series_split
from sklearn.linear_model import LinearRegression, Lasso
from utils import datasets

X = datasets('sp500')
y = X['AAPL']

# Split in time order so the test period follows the training period
X_train, X_test = time_series_split(X)
y_train, y_test = time_series_split(y)

n_prev = 2
tsr = TimeSeriesRegressor(Lasso(), n_prev=n_prev)
tsr.fit(X_train, y_train)
pred_train = tsr.predict(X_train)  # NumPy array of length len(X_train) - n_prev
pred_test = tsr.predict(X_test)
```
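
Because `predict` returns `n_prev` fewer values than it receives rows, the targets have to be trimmed before scoring. The sketch below assumes that each prediction corresponds to the target that immediately follows its window, so the first `n_prev` targets have no prediction. That alignment is inferred from the output length noted above, not confirmed against the library. `align_targets` and `mse` are hypothetical helpers, and the numbers are made-up illustration values.

```python
import numpy as np

def align_targets(y, n_prev):
    """Drop the first n_prev targets, which have no full window before them (assumed alignment)."""
    return np.asarray(y, dtype=float)[n_prev:]

def mse(y_true, y_pred):
    """Mean squared error between two equal-length arrays."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    if y_true.shape != y_pred.shape:
        raise ValueError("targets and predictions must have the same shape")
    return float(np.mean((y_true - y_pred) ** 2))

# Illustrative values only: five targets, and three predictions from n_prev = 2
y_demo = [10.0, 11.0, 12.0, 13.0, 14.0]
pred_demo = [12.5, 13.0, 13.5]

y_aligned = align_targets(y_demo, n_prev=2)
score = mse(y_aligned, pred_demo)
```

With the objects from the example above, the equivalent call would be `mse(align_targets(y_test, n_prev), pred_test)`, under the same alignment assumption.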
To forecast all stocks in the S&P 500 100 days into the future, use the `forecast` method:
```python
tsr = TimeSeriesRegressor(LinearRegression(), n_prev=2)
tsr.fit(X_train)  # single-dataset case: learn to predict the next row of X
fc = tsr.forecast(X_train, 100)
```
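
One common way to forecast many steps ahead with a one-step predictor is to feed each prediction back in as input. The sketch below illustrates that idea; whether `forecast` is implemented this way is an assumption. `recursive_forecast` and `extrapolate` are hypothetical names, and `extrapolate` is a hand-written stand-in for a fitted model.

```python
import numpy as np

def recursive_forecast(predict_next, history, n_prev, n_ahead):
    """Roll a one-step predictor forward n_ahead steps, feeding predictions back in.

    predict_next: callable mapping a flattened window of length n_prev * d to the next row (length d)
    history:      array of shape (T, d) with T >= n_prev
    """
    history = np.asarray(history, dtype=float)
    window = history[-n_prev:].copy()  # the last n_prev observed rows
    steps = []
    for _ in range(n_ahead):
        nxt = np.asarray(predict_next(window.ravel()), dtype=float)
        steps.append(nxt)
        # Slide the window: drop the oldest row, append the new prediction
        window = np.vstack([window[1:], nxt])
    return np.array(steps)

def extrapolate(flat_window):
    """Stand-in model for d = 2, n_prev = 2: continue the straight line through the last two rows."""
    older, newer = flat_window[:2], flat_window[2:]
    return 2 * newer - older

history = np.array([[1, 1.5], [2, 2.5], [3, 3.5], [4, 4.5], [5, 5.5]])
fc_demo = recursive_forecast(extrapolate, history, n_prev=2, n_ahead=3)
```

Errors compound under this scheme, because every step after the first is predicted from earlier predictions instead of observed data.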
See the IPython notebook for a longer interactive example.
Clone this repo and import it directly as a module. Automatic install support has not been added yet.
## Mechanics
The TSR takes either a single dataset (X) or two datasets (X, Y) of equal length. In the single-dataset case, the TSR assumes you want to predict the next element of the dataset from the previous elements. In either case, it builds each input row by taking the previous n timesteps (`n_prev`) and flattening them into a single vector.

For example, consider the dataset:

Feature 1 | Feature 2 |
---|---|
1 | 1.5 |
2 | 2.5 |
3 | 3.5 |
4 | 4.5 |
5 | 5.5 |
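
With `n_prev=2`, each row of the input matrix X holds two consecutive timesteps flattened into one vector:

Feature 1 (t-2) | Feature 2 (t-2) | Feature 1 (t-1) | Feature 2 (t-1) |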
---|---|---|---|
1 | 1.5 | 2 | 2.5 |
2 | 2.5 | 3 | 3.5 |
3 | 3.5 | 4 | 4.5 |

The target matrix Y holds the timestep that follows each window:

Feature 1 | Feature 2 |
---|---|
3 | 3.5 |
4 | 4.5 |
5 | 5.5 |
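
The transformation shown in the tables above can be sketched in a few lines of NumPy. This is an illustration of the windowing idea and not the library's actual code; `make_windows` is a hypothetical helper name.

```python
import numpy as np

def make_windows(data, n_prev):
    """Turn a (T, d) sequence into flattened lag windows X and next-step targets Y."""
    data = np.asarray(data, dtype=float)
    if data.ndim == 1:
        data = data.reshape(-1, 1)  # treat a 1-D sequence as a single feature
    n_samples = len(data) - n_prev
    if n_samples <= 0:
        raise ValueError("need more than n_prev rows to build a window")
    # Row i of X is timesteps i .. i + n_prev - 1, flattened into one vector
    X = np.stack([data[i:i + n_prev].ravel() for i in range(n_samples)])
    # Row i of Y is the timestep right after that window
    Y = data[n_prev:]
    return X, Y

data = np.array([[1, 1.5], [2, 2.5], [3, 3.5], [4, 4.5], [5, 5.5]])
X_win, Y_win = make_windows(data, n_prev=2)
```

For the five-row dataset above, `X_win` has three rows of four values and `Y_win` has three rows of two values, which is why predictions are `n_prev` elements shorter than the input.
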
The X and Y datasets can now be fit with any regression technique in sklearn. If the technique cannot handle vector outputs, use the `parallel_models` input to predict each feature sequence with its own regressor, each mapping the multi-dimensional input to a single output.
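
The idea of fitting one single-output model per target column can be sketched as follows. How `parallel_models` is implemented internally is an assumption here. `ParallelModels` and `LeastSquares` are hypothetical classes, with a plain NumPy least-squares fit standing in for an sklearn regressor so that the snippet is self-contained.

```python
import numpy as np

class LeastSquares:
    """Stand-in single-output regressor with an sklearn-like fit/predict interface."""

    def fit(self, X, y):
        A = np.column_stack([X, np.ones(len(X))])  # append an intercept column
        self.coef_, *_ = np.linalg.lstsq(A, y, rcond=None)
        return self

    def predict(self, X):
        A = np.column_stack([X, np.ones(len(X))])
        return A @ self.coef_

class ParallelModels:
    """Fit an independent single-output model for each column of Y."""

    def __init__(self, make_model):
        self.make_model = make_model  # zero-argument factory returning a fresh model

    def fit(self, X, Y):
        Y = np.asarray(Y, dtype=float)
        self.models_ = [self.make_model().fit(X, Y[:, j]) for j in range(Y.shape[1])]
        return self

    def predict(self, X):
        # Stack the per-column predictions back into a (n_samples, n_outputs) array
        return np.column_stack([m.predict(X) for m in self.models_])

# The windowed X and Y from the tables above
X_demo = np.array([[1, 1.5, 2, 2.5], [2, 2.5, 3, 3.5], [3, 3.5, 4, 4.5]])
Y_demo = np.array([[3, 3.5], [4, 4.5], [5, 5.5]])

pm = ParallelModels(LeastSquares).fit(X_demo, Y_demo)
Y_hat = pm.predict(X_demo)
```

scikit-learn's own `MultiOutputRegressor` follows the same one-model-per-output pattern and can wrap any single-output estimator.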