An intelligent information system for organizing online text documents

Han-joon Kim, Sang-goo Lee
2004 Knowledge and Information Systems  
This paper describes an intelligent information system for effectively managing huge amounts of online text documents (such as Web documents) in a hierarchical manner. The organizational capabilities of this system are able to evolve semi-automatically with minimal human input. The system starts with an initial taxonomy in which documents are automatically categorized, and then evolves so as to provide a good indexing service as the document collection grows or its usage changes. To this end,
more » ... propose a series of algorithms that utilize text-mining technologies such as document clustering, document categorization, and hierarchy reorganization. In particular, clustering and categorization algorithms have been intensively studied in order to provide evolving facilities for hierarchical structures and categorization criteria. Through experiments using the Reuters-21578 document collection, we evaluate the performance of the proposed clustering and categorization methods by comparing them to those of well-known conventional methods.
doi:10.1007/bf02637152 fatcat:cqrxxmveqvenbfmt3jfx53jggm