A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2020.
The file type is application/pdf.
HiTR: Hierarchical Topic Model Re-estimation for Measuring Topical Diversity of Documents
[article]
2018
arXiv
pre-print
A high degree of topical diversity is often considered to be an important characteristic of interesting text documents. A recent proposal for measuring topical diversity identifies three distributions for assessing the diversity of documents: distributions of words within documents, words within topics, and topics within documents. Topic models play a central role in this approach and, hence, their quality is crucial to the efficacy of measuring topical diversity. The quality of topic models is
arXiv:1810.05436v1
fatcat:y7qhgfr62zfhxl4webmf7nugna