# Chapter 6 Finite mixture models

## 6.1 Principle

Energy signature models offer only a coarse disaggregation of energy use into three components: heating, cooling, and baseline consumption. Furthermore, they rely on very long sampling times and cannot predict sub-daily consumption profiles. Finite Mixture Models (FMM) are one way to take the disaggregation of the baseline energy consumption further. Their most common special case is the Gaussian Mixture Model (GMM).

Finite mixture models assume that the outcome $$y$$ is drawn from one of several distributions, the identity of which is controlled by a categorical mixing distribution. For instance, the mixture of $$K$$ normal distributions $$f$$ with locations $$\mu_k$$ and scales $$\sigma_k$$ reads: $\begin{equation} p(y_i|\lambda, \mu, \sigma) = \sum_{k=1}^K \lambda_k f(y_i|\mu_k,\sigma_k) \tag{6.1} \end{equation}$ where $$\lambda_k$$ is the (positive) mixing proportion of the $$k$$th component and $$\sum_{k=1}^K \lambda_k = 1$$. The FMM thus assigns each observation to one of a finite number of distributions, with probability $$\lambda_k$$. The optimal number of components is not always a trivial choice: studies involving GMM often rely on a model selection index, such as the Bayesian Information Criterion (BIC), to guide the choice of an appropriate value for $$K$$.
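As a minimal sketch of Eq. (6.1) and of BIC-based selection of $$K$$, the following example fits Gaussian mixtures with an increasing number of components to synthetic data and keeps the one with the lowest BIC. The data (two consumption regimes) and all numerical values are hypothetical, and scikit-learn's `GaussianMixture` is used here purely for illustration:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Hypothetical baseline consumption data drawn from two regimes,
# e.g. a low-occupancy mode and a high-occupancy mode
y = np.concatenate([rng.normal(10.0, 1.0, 300),
                    rng.normal(18.0, 2.0, 200)]).reshape(-1, 1)

# Fit GMMs for K = 1..5 and select the number of components by BIC
models = [GaussianMixture(n_components=k, random_state=0).fit(y)
          for k in range(1, 6)]
best = min(models, key=lambda m: m.bic(y))

print(best.n_components)      # K selected by the BIC
print(best.weights_)          # mixing proportions lambda_k (sum to 1)
print(best.means_.ravel())    # locations mu_k
```

With well-separated components such as these, the BIC recovers the true number of regimes; on real consumption data the minimum is often shallower, which is why the choice of $$K$$ deserves scrutiny.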

The dependency of observations $$y$$ on explanatory variables $$x$$ can be included in the FMM by formulating its parameters $$\left\{ \lambda_k(x), \mu_k(x), \sigma_k(x) \right\}$$ as functions of the given value $$x$$ of these regressors. Furthermore, in order to include the effects of different power demand behaviours, the mixing proportions $$\lambda_k$$ can be modelled as dependent on a categorical variable $$z$$. Finite Mixture Models thus offer great flexibility for disaggregating and predicting energy uses, while including the possible effects of continuous or discrete explanatory variables.
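The structure described above can be sketched as a density evaluation where the mixing proportions depend on a categorical variable $$z$$ and the component locations depend linearly on a regressor $$x$$. All parameter values below are hypothetical placeholders (e.g. $$z$$ as a weekday/weekend indicator and $$x$$ as an outdoor temperature), chosen only to make the dependence structure concrete:

```python
import numpy as np
from scipy.stats import norm

# Hypothetical 2-component mixture: lambda_k depends on a categorical
# variable z, and the locations mu_k(x) depend linearly on a regressor x
lam = {"weekday": np.array([0.8, 0.2]),   # mixing proportions lambda_k(z)
       "weekend": np.array([0.3, 0.7])}
beta0 = np.array([10.0, 18.0])            # intercepts of mu_k(x)
beta1 = np.array([-0.2, 0.0])             # slopes of mu_k(x)
sigma = np.array([1.0, 2.0])              # scales sigma_k

def mixture_density(y, x, z):
    """p(y | x, z) = sum_k lambda_k(z) * Normal(y | mu_k(x), sigma_k)."""
    mu = beta0 + beta1 * x                # component locations at this x
    return float(np.sum(lam[z] * norm.pdf(y, mu, sigma)))

# The same outcome y is more likely under the weekday mixing proportions,
# because the first component dominates there
print(mixture_density(y=9.0, x=5.0, z="weekday"))
print(mixture_density(y=9.0, x=5.0, z="weekend"))
```

In a full model the parameters would of course be estimated from data rather than fixed; the point here is only how $$x$$ and $$z$$ enter the mixture density.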