Construction of supervised and unsupervised learning systems for multilingual text categorization

Chung-Hong Lee, Hsin-Chang Yang
2009 Expert Systems with Applications
Given the huge amount of textual data available from a variety of sources, users across internationally distributed information regions need effective methods and tools to discover, retrieve, and categorize relevant information, in whatever language and form it is stored. This need has drawn together diverse research communities around the problems of multilingual text categorization. In this work, we implemented and measured the performance of the leading supervised and unsupervised approaches for multilingual text categorization. We selected support vector machines (SVM) as the representative supervised technique, and latent semantic indexing (LSI) and self-organizing maps (SOM) as the representative unsupervised methods, for system implementation. The preliminary results show that our platform models, covering both supervised and unsupervised learning methods, have potential for multilingual text categorization.
doi:10.1016/j.eswa.2007.12.052 fatcat:rwqnbwz2lra33o4nwvpge6peqi
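
The abstract names SVM as the supervised technique but this record gives no implementation details. As an illustration only, the sketch below trains a linear SVM on bag-of-words counts from a mixed English/Spanish toy corpus. Everything in it is an assumption rather than the authors' method: the four training sentences are invented, the function names (`featurize`, `train_linear_svm`, `predict`) are hypothetical, and the Pegasos-style subgradient training is a simple stand-in chosen so the example runs with the standard library alone.

```python
from collections import Counter


def featurize(text):
    """Bag-of-words term counts over lowercase whitespace tokens."""
    return Counter(text.lower().split())


def train_linear_svm(samples, labels, epochs=50, lam=0.01):
    """Train a linear SVM (hinge loss, L2 penalty) by Pegasos-style
    subgradient descent. Labels must be +1 or -1.
    Returns a dict mapping term -> weight (no bias term)."""
    weights = {}
    step = 0
    for _ in range(epochs):
        for features, y in zip(samples, labels):
            step += 1
            eta = 1.0 / (lam * step)  # decaying learning rate
            score = sum(weights.get(t, 0.0) * c for t, c in features.items())
            # Shrink all weights (gradient of the L2 penalty).
            scale = 1.0 - eta * lam
            for t in weights:
                weights[t] *= scale
            # Hinge-loss subgradient step when the margin is violated.
            if y * score < 1.0:
                for t, c in features.items():
                    weights[t] = weights.get(t, 0.0) + eta * y * c
    return weights


def predict(weights, text):
    """Return the category label for a piece of text."""
    score = sum(weights.get(t, 0.0) * c for t, c in featurize(text).items())
    return "sports" if score > 0 else "finance"


# Invented toy corpus (assumption): +1 = sports, -1 = finance.
corpus = [
    ("the team won the football match", 1),
    ("the bank raised interest rates", -1),
    ("el equipo gana el partido", 1),
    ("el banco sube las tasas", -1),
]
samples = [featurize(text) for text, _ in corpus]
labels = [label for _, label in corpus]
weights = train_linear_svm(samples, labels)

print(predict(weights, "football team"))
print(predict(weights, "tasas del banco"))
```

A single shared vocabulary across both languages keeps the sketch short. A real multilingual system would need language-aware tokenization and far more training data, and the unsupervised LSI and SOM approaches named in the abstract are not covered here.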