Nonparametric Bayesian Storyline Detection from Microtexts [article]

Vinodh Krishnan, Jacob Eisenstein
2016 arXiv pre-print
News events and social media are composed of evolving storylines, which capture public attention for a limited period of time. Identifying storylines requires integrating temporal and linguistic information, and prior work takes a largely heuristic approach. We present a novel online non-parametric Bayesian framework for storyline detection, using the distance-dependent Chinese Restaurant Process (dd-CRP). To ensure efficient linear-time inference, we employ a fixed-lag Gibbs sampling ... which is novel for the dd-CRP. We evaluate on the TREC Twitter Timeline Generation (TTG), obtaining encouraging results: despite using a weak baseline retrieval model, the dd-CRP story clustering method is competitive with the best entries in the 2014 TTG task.
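As a rough illustration of the dd-CRP idea named in the abstract, the following minimal sketch draws cluster assignments from a sequential dd-CRP prior over timestamps. It is not the authors' model or their fixed-lag Gibbs sampler: the exponential decay function, the parameters `alpha` and `decay`, the restriction to links pointing backward in time, and all function names are assumptions made for illustration, and no text likelihood is included.

```python
import math
import random


def ddcrp_link_probs(times, i, alpha=1.0, decay=1.0):
    """Prior probability that item i links to each item j <= i.

    Assumes `times` is sorted ascending. A link to an earlier item j is
    weighted by an (assumed) exponential decay in the time gap; a
    self-link (j == i) has weight alpha and starts a new cluster.
    """
    weights = []
    for j in range(i + 1):
        if j == i:
            weights.append(alpha)
        else:
            weights.append(math.exp(-(times[i] - times[j]) / decay))
    total = sum(weights)
    return [w / total for w in weights]


def sample_links(times, alpha=1.0, decay=1.0, seed=0):
    """Draw one link per item from the prior (no likelihood term)."""
    rng = random.Random(seed)
    links = []
    for i in range(len(times)):
        probs = ddcrp_link_probs(times, i, alpha, decay)
        links.append(rng.choices(range(i + 1), weights=probs)[0])
    return links


def clusters_from_links(links):
    """Label each item by the self-linking root its chain of links reaches.

    Because every link points to the item itself or to an earlier item,
    the label of links[i] is already known when item i is processed.
    """
    labels = []
    for i, j in enumerate(links):
        labels.append(i if j == i else labels[j])
    return labels


if __name__ == "__main__":
    # Hypothetical timestamps (e.g. hours): two bursts separated by a gap.
    timestamps = [0.0, 0.1, 0.2, 5.0, 5.1]
    links = sample_links(timestamps, alpha=0.5, decay=1.0, seed=0)
    print(links, clusters_from_links(links))
```

In this sketch, items close together in time are likely to link to one another and so fall into the same cluster, while a large time gap makes a self-link (a new storyline) comparatively more probable. A full model would combine these prior weights with a likelihood over the text of each item.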
arXiv:1601.04580v2 fatcat:bdoucrzahncijpbb3w5t2usjbq