Abstract
This paper aims to use mixture models to produce predictions from
time series data. Given data of the form $(t_i, y_i)$, $i = 1, \ldots, T$, we propose a mixture
model localized at time point $t_T$ with the $k$-th component as $y_i = m_k(t_i) + \epsilon_{ik}$
with mixing proportions $\pi_k(t_i)$ such that $0 \le \pi_k(t_i) \le 1$ and
$\sum_{k=1}^{K} \pi_k(t_i) = 1$,
where $K$ is the number of components. The $m_k(\cdot)$ are smooth unspecified regression
functions, and the errors $\epsilon_{ik} \sim N(0, \sigma^2)$ are independently distributed.
Estimation of this model is achieved through a kernel-weighted version of the
EM-algorithm, using exponential kernels with different bandwidths (neighbourhood
sizes) $h_k$ as weight functions. By modelling a mixture of local regressions
at a target time point $t_T$ but with different bandwidths $h_k$, the estimated mixture
probabilities are informative about the amount of information available in the
data set at the scale of resolution corresponding to each bandwidth. Nadaraya-
Watson and local linear estimators are used to carry out the localized estimation
step. For prediction at time point $t_{T+1}$, suitable prediction methods are
provided for each local estimator and compared with competing forecasting routines. The data under
study give the energy use for Bolivia, Lebanon, and Greece from 1971 to 2011.
1 Introduction
Mixture models play an important role in the statistical analysis of data
thanks to their flexibility to model a wide variety of random phenomena. They
have been successfully employed in marketing and econometrics (Frühwirth-
Schnatter, 2001) as well as biology and epidemiology (Green and Richardson,
2002), to name a few out of a huge number of fields of application.
One useful type of mixture model is the mixture of regression models. Mixtures
of regression models are appropriate to use when the observations are
from several subgroups with missing grouping identities, and in each subgroup,
the response has a linear relationship with one or more other recorded variables.
Many efforts have been made to extend such models to finite mixtures of generalized
linear models, which are comprehensively discussed by McLachlan and
Peel (2004). Bayesian approaches for mixture regression models are summarized
by Frühwirth-Schnatter (2006). Mixture models continue to be a topic of
intense research activity, with special issues being edited in close succession
(Böhning et al, 2014; Hinde et al, 2016). A large proportion of articles in those
special issues discuss variants of mixture regression models, such as Poisson
regression, spline regression, or regression under censoring.
Recently, mixtures of nonparametric regression models, which relax the linearity
assumption on the regression functions, have gained particular attention.
For example, Young and Hunter (2010) use kernel regression to model covariate-dependent
proportions for mixtures of linear regression models, an idea which
was further developed into a semi-parametric approach by Huang and Yao
(2012). Huang et al (2013) have proposed a nonparametric finite regression
mixture model where the mixing proportions, the mean functions, and the variance
functions are all nonparametric, with an application to the U.S. house price
index (HPI) data. However, to our knowledge, there is no statistical method
for prediction from time series based on mixture models and nonparametric
regression. Nonparametric regression is a technique for modelling (possibly
non-linear) trends in data. One approach to nonparametric regression is local
modelling, which estimates the mean function $m(t)$ locally using a set of
parametric models.
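As a minimal sketch of local modelling at a target time point, the following Python code computes a local constant (Nadaraya-Watson) fit by kernel-weighted averaging and a local linear fit by kernel-weighted least squares. The kernel form exp(-|t_i - t0| / h), the function names, and the toy data are assumptions made for illustration and are not taken from the paper. The sketch fits a single local regression, not the mixture model.

```python
import numpy as np


def exp_kernel_weights(t, t0, h):
    """Weights w_i = exp(-|t_i - t0| / h).

    Assumption: this particular exponential kernel is chosen for illustration;
    h > 0 is the bandwidth (neighbourhood size).
    """
    t = np.asarray(t, dtype=float)
    return np.exp(-np.abs(t - t0) / h)


def nadaraya_watson(t, y, t0, h):
    """Local constant estimate of m(t0): kernel-weighted mean of the responses."""
    y = np.asarray(y, dtype=float)
    w = exp_kernel_weights(t, t0, h)
    return float(np.sum(w * y) / np.sum(w))


def local_linear(t, y, t0, h):
    """Local linear estimate at t0 via kernel-weighted least squares.

    Fits y_i ~ a + b * (t_i - t0) with weights w_i and returns (a, b),
    where a estimates m(t0) and b estimates the local slope.
    """
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    w = exp_kernel_weights(t, t0, h)
    X = np.column_stack([np.ones_like(t), t - t0])
    sw = np.sqrt(w)  # scaling rows by sqrt(w) turns WLS into ordinary LS
    beta, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return float(beta[0]), float(beta[1])


# Toy data (hypothetical): a noisy linear trend observed yearly.
rng = np.random.default_rng(0)
years = np.arange(1971, 2012)
values = 2.0 + 3.0 * (years - 1971) + rng.normal(scale=1.0, size=years.size)

t_target = years[-1]  # localize at the last observed time point
nw_fit = nadaraya_watson(years, values, t_target, h=5.0)
ll_level, ll_slope = local_linear(years, values, t_target, h=5.0)
```

On data with a trend, the local constant fit at the right-hand boundary is pulled towards earlier observations, whereas the local linear fit also returns a slope estimate at the target point.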