Time Series Models

Definition

Models for sequential data where the order of observations matters and temporal dependencies carry predictive information. Covers classical statistical methods (ARIMA, state-space models) and modern neural approaches.

Intuition

A time series is a sequence of observations indexed by time. The key challenge is that observations are not i.i.d.: knowing yesterday’s value tells you something about today’s. The goal is to model this autocorrelation structure to make forecasts.

Formal Description

Stationarity

A time series is weakly stationary if:

  • $E[y_t] = \mu$ (constant mean)
  • $\mathrm{Cov}(y_t, y_{t-k})$ depends only on lag $k$, not on $t$

Most classical methods assume stationarity. The Augmented Dickey-Fuller (ADF) test checks for unit roots (non-stationarity).

Differencing to achieve stationarity: $y'_t = y_t - y_{t-1}$. The order of differencing $d$ is chosen such that $\Delta^d y_t$ is stationary.
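A minimal numpy sketch of first-order differencing: a random walk (cumulative sum of white noise) is non-stationary, but one difference recovers the stationary increments.

```python
import numpy as np

rng = np.random.default_rng(0)

# A random walk is non-stationary: its variance grows with t.
steps = rng.normal(size=500)
walk = np.cumsum(steps)

# First-order differencing y'_t = y_t - y_{t-1} recovers the
# stationary white-noise increments.
diff1 = np.diff(walk)
```

In practice `statsmodels.tsa.stattools.adfuller` would be used to test the differenced series for a remaining unit root.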

ARIMA

AR($p$) — Autoregressive: $y_t = c + \phi_1 y_{t-1} + \dots + \phi_p y_{t-p} + \varepsilon_t$

MA($q$) — Moving Average: $y_t = \mu + \varepsilon_t + \theta_1 \varepsilon_{t-1} + \dots + \theta_q \varepsilon_{t-q}$

ARMA($p, q$): $y_t = c + \sum_{i=1}^{p} \phi_i y_{t-i} + \varepsilon_t + \sum_{j=1}^{q} \theta_j \varepsilon_{t-j}$

ARIMA($p, d, q$): Apply $d$-th differencing to $y_t$, then fit ARMA($p, q$).

SARIMA($p, d, q$)($P, D, Q$)$_s$: adds seasonal AR, I, MA terms at lag $s$ (e.g., $s = 12$ for monthly data).

Order selection:

  • ACF (autocorrelation function): cuts off after lag $q$ → MA($q$)
  • PACF (partial autocorrelation): cuts off after lag $p$ → AR($p$)
  • Use AIC/BIC for model selection; auto_arima from pmdarima automates this.
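As a sketch of what fitting an AR model amounts to, the snippet below simulates an AR(2) process and recovers its coefficients by ordinary least squares on lagged values (in practice `pmdarima.auto_arima` or `statsmodels` ARIMA would handle this, including order selection via AIC/BIC):

```python
import numpy as np

rng = np.random.default_rng(1)
phi = np.array([0.6, 0.3])   # true AR(2) coefficients (stationary: 0.6 + 0.3 < 1)
n = 2000

# Simulate y_t = phi_1 * y_{t-1} + phi_2 * y_{t-2} + eps_t.
y = np.zeros(n)
eps = rng.normal(scale=0.5, size=n)
for t in range(2, n):
    y[t] = phi[0] * y[t - 1] + phi[1] * y[t - 2] + eps[t]

# Fit AR(2) by least squares: regress y_t on its first two lags.
X = np.column_stack([y[1:-1], y[:-2]])   # columns: lag 1, lag 2
phi_hat, *_ = np.linalg.lstsq(X, y[2:], rcond=None)
```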

Exponential Smoothing (ETS)

Weighted average of past observations, with exponentially decaying weights.

Simple ES: $\hat{y}_{t+1} = \alpha y_t + (1 - \alpha)\hat{y}_t$, with smoothing parameter $0 < \alpha < 1$.

Holt-Winters (Triple ES): models level, trend, and seasonality. An ETS model is specified by:

  • Error type (Additive/Multiplicative)
  • Trend type (None/Additive/Additive-damped)
  • Seasonal type (None/Additive/Multiplicative)

Multiplicative seasonality is appropriate when seasonal fluctuations are proportional to the level.
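The simple ES recurrence above can be sketched in a few lines of numpy (for Holt-Winters with trend and seasonality, `statsmodels.tsa.holtwinters.ExponentialSmoothing` is the usual tool):

```python
import numpy as np

def simple_es(y, alpha):
    """Simple exponential smoothing: level_t = alpha * y_t + (1 - alpha) * level_{t-1}.
    Returns the final level, which is the one-step-ahead forecast."""
    level = y[0]
    for obs in y[1:]:
        level = alpha * obs + (1 - alpha) * level
    return level
```

Unrolling the recurrence shows the exponentially decaying weights: the observation $k$ steps back receives weight $\alpha(1-\alpha)^k$.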

State-Space Models (Local Linear Trend)

$y_t = \mu_t + \varepsilon_t$ (observation), $\mu_{t+1} = \mu_t + \nu_t + \xi_t$ (level), $\nu_{t+1} = \nu_t + \zeta_t$ (trend), with independent Gaussian noise terms.

The Kalman filter computes $p(\mu_t, \nu_t \mid y_{1:t})$ analytically for linear-Gaussian systems.
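A pared-down sketch of the Kalman filter for the local level special case (the local linear trend model without the trend state); `statsmodels.tsa.statespace` implements the full version:

```python
import numpy as np

def kalman_local_level(y, sigma_eps2, sigma_eta2):
    """Kalman filter for the local level model:
       y_t = mu_t + eps_t,  mu_{t+1} = mu_t + eta_t."""
    mu, P = y[0], 1e6             # diffuse initialisation of the state
    filtered = []
    for obs in y:
        P = P + sigma_eta2        # predict: state noise inflates uncertainty
        K = P / (P + sigma_eps2)  # Kalman gain
        mu = mu + K * (obs - mu)  # update: shift towards the innovation
        P = (1 - K) * P
        filtered.append(mu)
    return np.array(filtered)
```

The gain $K$ balances trust in the model against trust in the new observation: large observation noise `sigma_eps2` shrinks $K$ and smooths more heavily.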

VAR (Vector Autoregression)

Multivariate extension of AR for $k$ time series: $\mathbf{y}_t = \mathbf{c} + A_1 \mathbf{y}_{t-1} + \dots + A_p \mathbf{y}_{t-p} + \boldsymbol{\varepsilon}_t$, where each $A_i$ is a $k \times k$ coefficient matrix.

Models cross-series dependencies. Used in macroeconomics; Granger causality tests check whether one series helps predict another.
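A numpy sketch of a VAR(1) fit by least squares, with a coefficient matrix in which the second series is unaffected by the first (so, in Granger terms, $y_1$ carries no predictive information for $y_2$); `statsmodels.tsa.api.VAR` is the standard implementation:

```python
import numpy as np

rng = np.random.default_rng(2)
A = np.array([[0.5, 0.2],
              [0.0, 0.4]])   # A[1, 0] = 0: y1 does not enter y2's equation
n = 5000

# Simulate y_t = A @ y_{t-1} + eps_t.
Y = np.zeros((n, 2))
for t in range(1, n):
    Y[t] = A @ Y[t - 1] + rng.normal(scale=0.3, size=2)

# Fit VAR(1) by least squares: regress y_t on y_{t-1}.
B, *_ = np.linalg.lstsq(Y[:-1], Y[1:], rcond=None)
A_hat = B.T
```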

Neural Time Series Models

LSTM/GRU: sequence-to-sequence models that learn long-range dependencies from data (see recurrent_networks).

Temporal Convolutional Networks (TCN): dilated causal convolutions; can outperform LSTMs with faster training.

Transformer-based: Informer, Autoformer, PatchTST for long-horizon forecasting; attention captures long-range dependencies without sequential processing.

N-BEATS, N-HiTS: pure neural, interpretable forecasting; state of the art on M4 benchmark.

Evaluation Metrics

| Metric | Formula | Properties |
| --- | --- | --- |
| MAE | $\frac{1}{T}\sum \lvert y_t - \hat{y}_t \rvert$ | Scale-dependent, robust to outliers |
| RMSE | $\sqrt{\frac{1}{T}\sum (y_t - \hat{y}_t)^2}$ | Scale-dependent; penalises large errors |
| MAPE | $\frac{100}{T}\sum \left\lvert \frac{y_t - \hat{y}_t}{y_t} \right\rvert$ | Scale-free; undefined for $y_t = 0$ |
| MASE | MAE normalised by naive in-sample MAE | Scale-free and meaningful across series |
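These metrics are a few lines each in numpy:

```python
import numpy as np

def mae(y, yhat):
    return np.mean(np.abs(y - yhat))

def rmse(y, yhat):
    return np.sqrt(np.mean((y - yhat) ** 2))

def mape(y, yhat):
    # Undefined when any y_t == 0.
    return 100 * np.mean(np.abs((y - yhat) / y))

def mase(y, yhat, y_train):
    # Scale by the in-sample MAE of the naive (lag-1) forecast.
    naive_mae = np.mean(np.abs(np.diff(y_train)))
    return mae(y, yhat) / naive_mae
```

A MASE below 1 means the forecast beats the naive forecast on the scale of the training data.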

Applications

  • Insurance: claims reserving (chain-ladder, stochastic development methods)
  • Finance: volatility forecasting (GARCH), algorithmic trading signals
  • Demand forecasting: inventory management, supply chain
  • Anomaly detection: detecting unusual spikes in time series metrics

Trade-offs

  • ARIMA: interpretable, fast, principled; limited to linear dependencies and stationary processes.
  • LSTM/Transformers: capture non-linear temporal patterns; require more data and tuning.
  • Choose classical methods when data is limited or explainability is required; neural methods for large-scale, complex patterns.