ON-LINE INFERENCE FOR DATA STREAMS

PETER CLIFFORD
2006 Statistical Problems in Particle Physics, Astrophysics and Cosmology  
Rapid accumulation of substantial datasets is now common in many data processing applications, for example monitoring and examining Internet traffic, analysing high-frequency financial data in market trading, voice and video capture, and data logging in numerous areas of scientific enquiry. Markov chain Monte Carlo (MCMC) methods revolutionised statistical analysis in the 1990s by providing practical, computationally feasible access to the flexible and coherent framework of Bayesian inference.
However, massive datasets have produced difficulties for these methods since, with a few simple exceptions, MCMC implementations require a complete scan of what might be several gigabytes of data at each iteration of the algorithm. For time-series data, progress is possible using modern sequential Monte Carlo methods (known as particle filters). With suitable modifications, the techniques can be adapted to deal with more general data catalogues.
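To make the contrast with full-scan MCMC concrete, the following is a minimal sketch of a bootstrap particle filter that touches each observation exactly once and keeps only a fixed-size set of particles in memory. It is an illustration under stated assumptions, not the method developed in the paper: the random-walk state model, the Gaussian noise levels, the Normal(0, 1) prior, and the names `bootstrap_filter` and `noisy_stream` are all hypothetical choices made for this example.

```python
import math
import random


def bootstrap_filter(stream, n_particles=500, state_sd=0.1, obs_sd=0.5, seed=0):
    """Yield the filtered mean of the hidden state after each observation.

    Assumed toy model (not taken from the paper):
        x_t = x_{t-1} + Normal(0, state_sd^2)   (hidden state)
        y_t = x_t     + Normal(0, obs_sd^2)     (observation)
    """
    rng = random.Random(seed)
    # Initial particles drawn from an assumed Normal(0, 1) prior.
    particles = [rng.gauss(0.0, 1.0) for _ in range(n_particles)]
    for y in stream:
        # 1. Propagate every particle through the state equation.
        particles = [x + rng.gauss(0.0, state_sd) for x in particles]
        # 2. Weight each particle by the Gaussian likelihood of the new
        #    observation; subtracting the max log-weight avoids underflow.
        log_w = [-0.5 * ((y - x) / obs_sd) ** 2 for x in particles]
        max_lw = max(log_w)
        weights = [math.exp(lw - max_lw) for lw in log_w]
        total = sum(weights)
        # 3. Weighted average gives the filtered estimate of the state.
        mean = sum(w * x for w, x in zip(weights, particles)) / total
        # 4. Multinomial resampling: draw a new equally weighted particle set.
        particles = rng.choices(particles, weights=weights, k=n_particles)
        yield mean


def noisy_stream(n, level=2.0, obs_sd=0.5, seed=1):
    """Hypothetical data source: a constant level observed with Gaussian noise."""
    rng = random.Random(seed)
    for _ in range(n):
        yield level + rng.gauss(0.0, obs_sd)


# Each observation is consumed once and then discarded.
estimates = list(bootstrap_filter(noisy_stream(200)))
```

The cost per observation is proportional to the number of particles and does not grow with the length of the stream, which is the property that makes this family of methods attractive when rescanning the full dataset at every iteration is impractical.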
doi:10.1142/9781860948985_0048 fatcat:267asupnbve27ewc2zdn57xt7y