Online Clustering of Processes

Azadeh Khaleghi, Daniil Ryabko, Jérémie Mary, Philippe Preux
2012 Journal of Machine Learning Research
The problem of online clustering is considered in the case where each data point is a sequence generated by a stationary ergodic process. Data arrive in an online fashion so that the sample received at every timestep is either a continuation of some previously received sequence or a new sequence. The dependence between the sequences can be arbitrary. No parametric or independence assumptions are made; the only assumption is that the marginal distribution of each sequence is stationary and ergodic. A novel, computationally efficient algorithm is proposed and is shown to be asymptotically consistent (under a natural notion of consistency). The performance of the proposed algorithm is evaluated on simulated data, as well as on real datasets (motion classification).
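The online arrival protocol described in the abstract can be illustrated with a minimal sketch. This is not the paper's clustering algorithm; it only shows the bookkeeping for samples that either extend an existing sequence or start a new one. The class name `OnlineSequenceStore`, the method `observe`, and the use of an explicit sequence identifier per sample are assumptions made for illustration.

```python
class OnlineSequenceStore:
    """Hypothetical container for sequences that grow over time."""

    def __init__(self):
        # Maps a sequence identifier (an assumption of this sketch)
        # to the list of observations received so far.
        self.sequences = {}

    def observe(self, seq_id, value):
        """Record one sample; return True if it starts a new sequence."""
        is_new = seq_id not in self.sequences
        self.sequences.setdefault(seq_id, []).append(value)
        return is_new


# Each timestep delivers one sample, tagged with the sequence it belongs to.
stream = [("a", 0), ("b", 1), ("a", 1), ("c", 0), ("b", 1)]

store = OnlineSequenceStore()
flags = [store.observe(seq_id, value) for seq_id, value in stream]
```

In this sketch a clustering routine would be re-run (or updated) after each call to `observe`, using whatever sequences and lengths are available at that timestep.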
dblp:journals/jmlr/KhaleghiRMP12 fatcat:rj64msz3o5fldluomdcng2jtj4