Scikit-Learn¶

  • a sliding-window transformation first converts the time series into tabular or panel data
  • a tabular or time-series regression estimator is then fit on the transformed data
  • estimator
    • scikit-learn models (tabular regressors)
    • sktime models (time series regressors)
  • strategy
    • direct
      • fit a separate model for each step of the forecasting horizon
    • recursive
      • fit a single estimator for a one-step-ahead forecasting horizon, then call it iteratively to predict multiple steps ahead
    • multioutput
      • fit a single estimator that predicts the entire forecasting horizon in one call
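The sliding-window transformation described above can be sketched in plain NumPy. This is only an illustration of the idea, not sktime's internal implementation; the helper name `sliding_window` and the toy series are assumptions made for the example.

```python
import numpy as np


def sliding_window(y, window_length):
    """Turn a 1-D series into (X, target) pairs for one-step-ahead regression.

    Hypothetical helper for illustration; not part of sktime or scikit-learn.
    """
    y = np.asarray(y, dtype=float)
    n_windows = len(y) - window_length
    # Each row of X holds `window_length` consecutive observations.
    X = np.stack([y[i:i + window_length] for i in range(n_windows)])
    # The target is the observation that immediately follows each window.
    target = y[window_length:]
    return X, target


series = np.arange(10.0)  # toy series: 0, 1, ..., 9
X, target = sliding_window(series, window_length=3)
print(X.shape, target.shape)
print(X[0], target[0])
```

Each row of `X` can then be passed to any tabular regressor as a feature vector, with `target` as the value to predict.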

Load Data¶

In [1]:
from sktime.datasets import load_airline
from sktime.forecasting.model_selection import temporal_train_test_split

y = load_airline() # 144 monthly observations covering 12 years
y_train, y_test = temporal_train_test_split(y, test_size=36) # hold out last 3 years

Training¶

In [3]:
from sklearn.neighbors import KNeighborsRegressor
from sktime.forecasting.compose import make_reduction
from sktime.performance_metrics.forecasting import mean_absolute_percentage_error
import numpy as np

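# 1-nearest-neighbour tabular regressor from scikit-learn
regressor = KNeighborsRegressor(n_neighbors=1)

# reduce forecasting to tabular regression: each window of 15 past values predicts the next value
forecaster = make_reduction(regressor, window_length=15, strategy="recursive")

forecaster.fit(y_train)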
Out[3]:
RecursiveTabularRegressionForecaster(estimator=KNeighborsRegressor(n_neighbors=1),
                                     window_length=15)
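To make the recursive strategy concrete, here is a minimal hand-rolled sketch in NumPy. It assumes a simple 1-nearest-neighbour lookup as a stand-in for `KNeighborsRegressor(n_neighbors=1)`; the function names `one_nn_predict` and `recursive_forecast` and the toy series are hypothetical, and the code is not how sktime implements the forecaster.

```python
import numpy as np


def one_nn_predict(X_train, y_train, x):
    """Return the target of the training window closest to x (1-nearest neighbour)."""
    distances = np.linalg.norm(X_train - x, axis=1)
    return y_train[np.argmin(distances)]


def recursive_forecast(y, window_length, steps):
    """Forecast `steps` values by repeatedly applying a one-step-ahead model."""
    y = np.asarray(y, dtype=float)
    n_windows = len(y) - window_length
    # Tabularize the series: windows as features, the next value as target.
    X_train = np.stack([y[i:i + window_length] for i in range(n_windows)])
    y_train = y[window_length:]

    # Start from the last observed window.
    window = y[-window_length:].copy()
    predictions = []
    for _ in range(steps):
        y_hat = one_nn_predict(X_train, y_train, window)
        predictions.append(y_hat)
        # Slide the window forward, feeding the prediction back in as an input.
        window = np.append(window[1:], y_hat)
    return np.array(predictions)


toy = np.tile([1.0, 2.0, 3.0, 4.0], 5)  # perfectly periodic toy series
forecast = recursive_forecast(toy, window_length=4, steps=6)
print(forecast)
```

Because each prediction becomes an input for the next step, errors can accumulate over long horizons; the direct strategy avoids this at the cost of fitting one model per step.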

Forecasting¶

In [4]:
fh = np.arange(1, 37) # forecasting horizon: 1 to 36 steps ahead, matching the test set
y_pred = forecaster.predict(fh)

Evaluation and Visualization¶

In [6]:
from sktime.utils.plotting import plot_series

plot_series(y_train, y_test, y_pred, labels=["y_train", "y_test", "y_pred"])
mean_absolute_percentage_error(y_test, y_pred, symmetric=False)
Out[6]:
0.12887507224382988
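For reference, the non-symmetric mean absolute percentage error reported above is the average of |actual - predicted| / |actual|. The sketch below computes it by hand on made-up numbers; the function name `mape` is an assumption for illustration, and sktime's own implementation may differ in details such as how near-zero actual values are handled.

```python
import numpy as np


def mape(y_true, y_pred):
    """Non-symmetric MAPE: mean of |y_true - y_pred| / |y_true|.

    Illustrative helper; assumes no actual value is zero.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(y_true - y_pred) / np.abs(y_true)))


# Made-up values for illustration only (not the airline data).
actual = np.array([100.0, 200.0, 400.0])
predicted = np.array([110.0, 180.0, 400.0])
score = mape(actual, predicted)  # per-point errors: 0.10, 0.10, 0.00
print(score)
```

Read this way, the score of about 0.129 above means the forecasts deviate from the held-out values by roughly 13% on average.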