Novel Unsupervised Features for Czech Multi-label Document Classification
[chapter]
2014
Lecture Notes in Computer Science
This paper deals with automatic multi-label document classification in the context of a real application for the Czech News Agency. The main goal of this work is to propose novel, fully unsupervised features based on an unsupervised stemmer, Latent Dirichlet Allocation, and semantic spaces (HAL and COALS). The proposed features are integrated into the document classification task. Another interesting contribution is that these two semantic spaces have never been used in the context of
doi:10.1007/978-3-319-13647-9_8
fatcat:heooyb77lnch7bkfherzecelea