Latent Dirichlet learning for document summarization
2009
2009 IEEE International Conference on Acoustics, Speech and Signal Processing
Automatic summarization aims to extract representative content or sentences from a large corpus of documents. This paper presents a new hierarchical representation of words, sentences and documents in a corpus, and infers the Dirichlet distributions for latent topics at the word level and latent themes at the sentence level. The sentence-based latent Dirichlet allocation (SLDA) is accordingly established for document summarization. Different from the vector space
doi:10.1109/icassp.2009.4959927
dblp:conf/icassp/ChangC09
fatcat:i7iflhvfonbrrcwof5sbj7d6wq