Authorship Attribution with Author-aware Topic Models

Yanir Seroussi, Fabian Bohnert, Ingrid Zukerman
2012 Annual Meeting of the Association for Computational Linguistics  
Authorship attribution deals with identifying the authors of anonymous texts. Building on our earlier finding that the Latent Dirichlet Allocation (LDA) topic model can be used to improve authorship attribution accuracy, we show that employing a previously-suggested Author-Topic (AT) model outperforms LDA when applied to scenarios with many authors. In addition, we define a model that combines LDA and AT by representing authors and documents over two disjoint topic sets, and show that our model outperforms LDA, AT and support vector machines on datasets with many authors.
dblp:conf/acl/SeroussiBZ12 fatcat:yoje3dngzrgntje5g2at2kiy5m