A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2011; you can also visit the original URL.
The file type is
Proceedings of the 17th ACM SIGKDD International Conference Tutorials on - KDD '11 Tutorials
Probabilistic topic models are a suite of algorithms whose aim is to discover the hidden thematic structure in large archives of documents. In this article, we review the main ideas of this field, survey the current state-of-the-art, and describe some promising future directions. We first describe latent Dirichlet allocation (LDA)  , which is the simplest kind of topic model. We discuss its connections to probabilistic modeling, and describe two kinds of algorithms for topic discovery. Wedoi:10.1145/2107736.2107741 fatcat:patt4fqxrba35pxono3sbeo2j4