Seed-Guided Deep Document Clustering [chapter]

Mazar Moradi Fard, Thibaut Thonet, Eric Gaussier
2020 Lecture Notes in Computer Science  
Different users may be interested in different clustering views underlying a given collection (e.g., topic and writing style in documents). Enabling them to provide constraints reflecting their needs can then help obtain tailored clustering results. For document clustering, constraints can be provided in the form of seed words, each cluster being characterized by a small set of words. This seed-guided constrained document clustering problem was recently addressed through topic modeling
... s. In this paper, we jointly learn deep representations and bias the clustering results through the seed words, leading to a Seed-guided Deep Document Clustering approach. Its effectiveness is demonstrated on five public datasets.
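
To make the notion of seed-word constraints concrete, the following is a minimal sketch of how a small set of seed words per cluster could bias document assignments. It is not the paper's method, which jointly learns deep representations; it only counts seed-word occurrences. The function names (`seed_scores`, `assign_cluster`), the cluster names, and the example documents are all hypothetical.

```python
from collections import Counter

def seed_scores(document, seed_words):
    """Count how often each cluster's seed words occur in the document."""
    tokens = Counter(document.lower().split())
    return {
        cluster: sum(tokens[word] for word in words)
        for cluster, words in seed_words.items()
    }

def assign_cluster(document, seed_words):
    """Pick the cluster with the most seed-word hits, or None if there are none."""
    scores = seed_scores(document, seed_words)
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None

# Hypothetical seed words: each cluster is characterized by a small word set.
seed_words = {
    "sports": {"match", "team", "goal"},
    "finance": {"stock", "market", "bank"},
}

# Hypothetical documents, for illustration only.
docs = [
    "The team scored a late goal to win the match",
    "The bank said the stock market fell sharply",
    "Rain is expected tomorrow",
]

labels = [assign_cluster(doc, seed_words) for doc in docs]
print(labels)
```

Documents with no seed-word hits are left unassigned here; a learned representation is one way such documents could still be placed near a seed-defined cluster.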
doi:10.1007/978-3-030-45439-5_1 fatcat:cug7brgy6bdxzcrwynaiarcz6y