Chapter 7 Autoregressive models

7.1 Principle of ARMAX models

Most of the data recorded by monitoring buildings are time series: sequences of observations taken at successive points in time. The previous chapter dealt with data that were aggregated at a low enough frequency that successive values of the outcome variable could be considered independent from each other, and dependent only on explanatory variables. This aggregation, however, comes with a significant loss of information, since all dynamic effects are smoothed out. Dynamic models, including time-series models, allow the dependent variable to also be influenced by its own past values.

Autoregressive models are based on the idea that the current value of the series, \(y_t\), can be explained as a function of \(p\) past values, \(y_{t-1}\), \(y_{t-2}\), …, \(y_{t-p}\), where \(p\) determines the number of steps into the past needed to forecast the current value. The simplest time-series models are linear AutoRegressive (AR) models. The AR(p) model is of the form: \[\begin{equation} y_t = \phi_1 y_{t-1} + \phi_2 y_{t-2} + \dots + \phi_p y_{t-p} + w_t \tag{7.1} \end{equation}\] where \(w_t \sim N(0, \sigma_w^2)\) is independent and identically distributed (iid) noise with mean \(0\) and variance \(\sigma_w^2\). This notation is equivalent to writing: \[\begin{equation} y_t \sim N(\phi_1 y_{t-1} + \phi_2 y_{t-2} + \dots + \phi_p y_{t-p}, \sigma_w^2) \tag{7.2} \end{equation}\] but the time-series literature usually separates the noise into a dedicated variable \(w_t\) in order to formulate assumptions about it.

Figure 7.1: AR(2) model: an observation is conditioned on its two previous instances
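
As a minimal sketch of Eq. (7.1), the Python snippet below simulates an AR(2) series and recovers its coefficients by ordinary least squares. The function names, the coefficient values \(\phi = (0.6, -0.3)\), the burn-in length and the choice of least squares are illustrative assumptions, not part of the original text; a Bayesian fit would instead place priors on \(\phi\) and \(\sigma_w\).

```python
import numpy as np


def simulate_ar(phi, n, sigma_w=1.0, seed=0, burn=200):
    """Simulate y_t = sum_i phi_i * y_{t-i} + w_t with Gaussian white noise."""
    rng = np.random.default_rng(seed)
    phi = np.asarray(phi, dtype=float)
    p = len(phi)
    w = rng.normal(0.0, sigma_w, size=n + burn)
    y = np.zeros(n + burn)
    for t in range(p, n + burn):
        # y[t - p:t][::-1] is the vector (y_{t-1}, ..., y_{t-p})
        y[t] = np.dot(phi, y[t - p:t][::-1]) + w[t]
    # Drop the burn-in so the arbitrary zero start has little influence
    return y[burn:]


def fit_ar_least_squares(y, p):
    """Estimate phi_1..phi_p by regressing y_t on its p previous values."""
    n = len(y)
    # Column i holds y_{t-(i+1)} for t = p, ..., n-1
    X = np.column_stack([y[p - i - 1:n - i - 1] for i in range(p)])
    target = y[p:]
    phi_hat, *_ = np.linalg.lstsq(X, target, rcond=None)
    resid = target - X @ phi_hat
    # Residual standard deviation as an estimate of sigma_w
    sigma_hat = np.sqrt(np.sum(resid ** 2) / (len(resid) - p))
    return phi_hat, sigma_hat


# Assumed example values: a stationary AR(2) process
y = simulate_ar([0.6, -0.3], n=5000)
phi_hat, sigma_hat = fit_ar_least_squares(y, p=2)
print("estimated phi:", phi_hat, "estimated sigma_w:", sigma_hat)
```

With a long enough series the estimates should land close to the values used in the simulation, which is a quick sanity check before moving to more elaborate model structures.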

AR models can be extended into many forms, including the AutoRegressive Moving Average model with eXogenous variables, or ARMAX. In addition to depending on its own \(p\) previous values, AR(p), the output can be driven by additional inputs (X), and the white noise \(w_t\) can be replaced by a moving average of order \(q\), MA(q): \[\begin{equation} y_t = \sum_{i=1}^p \phi_i y_{t-i} + \sum_{j=0}^q \theta_j w_{t-j} + \sum_{k=1}^K \left[ \sum_{i=0}^p \beta_{k,i} x_{t-i,k} \right] \tag{7.3} \end{equation}\] On the right side of this equation, the second term, MA(q), is a linear combination of the current and \(q\) previous values of the white noise. The third term includes the influence of the current value and up to \(p\) previous values of each of the \(K\) explanatory variables. A more general notation could allow a different lag order for each input.
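
To make the three terms of Eq. (7.3) concrete, here is a hedged sketch that evaluates them one by one in a simulation loop. The function name `simulate_armax`, the array layout (`beta[k, i]` multiplies `x[t - i, k]`) and the demo values are assumptions made for illustration only.

```python
import numpy as np


def simulate_armax(phi, theta, beta, x, sigma_w=1.0, seed=0):
    """Simulate Eq. (7.3).

    phi   : AR coefficients phi_1..phi_p
    theta : MA coefficients theta_0..theta_q
    beta  : array of shape (K, p + 1), beta[k, i] multiplies x[t - i, k]
    x     : array of shape (n, K) with the K explanatory variables
    """
    rng = np.random.default_rng(seed)
    phi = np.asarray(phi, dtype=float)
    theta = np.asarray(theta, dtype=float)
    beta = np.asarray(beta, dtype=float)
    x = np.asarray(x, dtype=float)
    n, K = x.shape
    p, q = len(phi), len(theta) - 1
    if beta.shape != (K, p + 1):
        raise ValueError("beta must have shape (K, p + 1)")

    w = rng.normal(0.0, sigma_w, size=n)
    y = np.zeros(n)
    # The first max(p, q) values are left at zero (assumed initial condition)
    for t in range(max(p, q), n):
        ar = sum(phi[i - 1] * y[t - i] for i in range(1, p + 1))
        ma = sum(theta[j] * w[t - j] for j in range(q + 1))
        ex = sum(beta[k, i] * x[t - i, k]
                 for k in range(K) for i in range(p + 1))
        y[t] = ar + ma + ex
    return y


# Noise-free demo with one constant input, so the result can be checked by hand
x_demo = np.ones((60, 1))
y_demo = simulate_armax(phi=[0.5], theta=[1.0], beta=[[2.0, 1.0]],
                        x=x_demo, sigma_w=0.0)
print(y_demo[:3], y_demo[-1])
```

In this noise-free demo the output settles at \(\sum_i \beta_i / (1 - \sum_i \phi_i)\), the steady-state gain of the model, which is one way to relate a combination of parameters to a physical quantity.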

ARX and ARMAX models have been applied, among other uses, to modelling the heat dynamics of buildings and building components in the collaborative work of the IEA EBC Annex 58 (Madsen et al. 2015). The difficulties in applying AR(MA)X models lie not only in selecting and validating the model, but also in extracting physical information from its parameters, as each individual parameter lacks a direct physical meaning. An important step in ARX modelling is to select suitable orders for the input and output polynomials. This can be done by increasing the model order step by step until most of the significant autocorrelation of the residuals, and of their cross-correlation with the inputs, has been removed.
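
The stepwise order selection described above can be sketched as follows. This is only one possible reading of the procedure: the least-squares estimator, the 95 % white-noise band \(\pm 1.96/\sqrt{N}\), the 10 % tolerance on the share of lags outside the band, and all function names are assumptions made for this example, not the Annex 58 method itself.

```python
import numpy as np


def fit_arx(y, x, order):
    """Least-squares ARX fit of y_t on y_{t-1..t-order} and x_{t..t-order}."""
    n = len(y)
    cols = [y[order - i:n - i] for i in range(1, order + 1)]   # y_{t-i}
    cols += [x[order - i:n - i] for i in range(0, order + 1)]  # x_{t-i}
    X = np.column_stack(cols)
    target = y[order:]
    coef, *_ = np.linalg.lstsq(X, target, rcond=None)
    return coef, target - X @ coef


def correlation(a, b, max_lag):
    """Sample correlation between a_t and b_{t-lag} for lag = 1..max_lag."""
    a = (a - a.mean()) / a.std()
    b = (b - b.mean()) / b.std()
    n = len(a)
    return np.array([np.dot(a[lag:], b[:n - lag]) / n
                     for lag in range(1, max_lag + 1)])


def select_arx_order(y, x, max_order=6, max_lag=20, tolerance=0.1):
    """Return the first order whose residuals look approximately white.

    Falls back to max_order if no order passes the check.
    """
    for order in range(1, max_order + 1):
        _, resid = fit_arx(y, x, order)
        band = 1.96 / np.sqrt(len(resid))            # approximate 95 % band
        acf = correlation(resid, resid, max_lag)     # residual autocorrelation
        ccf = correlation(resid, x[order:], max_lag)  # residuals vs past input
        outside = np.mean(np.abs(np.concatenate([acf, ccf])) > band)
        if outside <= tolerance:
            return order
    return max_order


# Synthetic data from an assumed second-order ARX model with one input
rng = np.random.default_rng(1)
n = 3000
x = rng.normal(size=n)
w = rng.normal(size=n)
y = np.zeros(n)
for t in range(2, n):
    y[t] = (1.2 * y[t - 1] - 0.5 * y[t - 2]
            + 0.5 * x[t] + 0.3 * x[t - 1] + 0.1 * x[t - 2] + w[t])

order = select_arx_order(y, x)
coef, _ = fit_arx(y, x, 2)
print("selected order:", order)
print("coefficients at order 2:", coef)
```

A first-order model leaves clear structure in the residuals of these data, so the search moves on to higher orders. In practice the residual plots should be inspected as well, rather than relying on a single threshold.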

We refer to the books of Madsen (2007) or Shumway and Stoffer (2000) for a more extensive description of all types of AR models. Further structures include Integrated models for non-stationary data (ARIMA), Seasonal components for longer-term cyclic data (SARMA, SARIMAX), and the autoregressive conditionally heteroscedastic (ARCH) model for non-constant conditional noise variance.

7.2 Example

I am currently working on the use of time-series models for Bayesian forecasting of building energy use. There will be a tutorial here once I have some results. In the meantime, autoregressive models are also covered in the Stan user’s guide.

References

Madsen, Henrik. 2007. Time Series Analysis. CRC Press.

Madsen, Henrik, Peder Bacher, Geert Bauwens, An-Heleen Deconinck, Glenn Reynders, Staf Roels, Eline Himpe, and Guillaume Lethé. 2015. “Thermal Performance Characterization Using Time Series Data: IEA EBC Annex 58 Guidelines.”

Shumway, Robert H, and David S Stoffer. 2000. Time Series Analysis and Its Applications. Vol. 3. Springer.