Document Representation with Statistical Word Senses in Cross-Lingual Document Clustering
2015
International Journal of Pattern Recognition and Artificial Intelligence
Cross-lingual document clustering is the task of automatically organizing a large collection of multilingual documents into a few clusters according to their content or topic. The language barrier and translation ambiguity are two well-known challenges for cross-lingual document representation. To address them, we propose to represent cross-lingual documents through statistical word senses, which are automatically discovered from a parallel corpus through a novel cross-lingual word
doi:10.1142/s021800141559003x
fatcat:eevqzoolcjhlxex523jjm2q74m