Text data mining: discovery of important keywords in the cyberspace

H. Arimura, J. Abe, R. Fujino, H. Sakamoto, S. Shimozono, S. Arikawa
2000 Proceedings 2000 Kyoto International Conference on Digital Libraries: Research and Practice  
This paper describes applications of the optimized pattern discovery framework to text and Web mining. In particular, we introduce a class of simple combinatorial patterns over phrases, called proximity phrase association patterns, and consider the problem of finding the patterns that optimize a given statistical measure within the whole class of patterns in a large collection of unstructured texts. For this class of patterns, we develop fast and robust text mining algorithms based on techniques in computational geometry and string matching. Finally, we successfully apply the developed text mining algorithms to experiments on interactive document browsing in a large text database and keyword discovery from Web bases.
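The abstract does not define the pattern class or the statistical measure in detail, so the following is only a minimal sketch under stated assumptions: a pattern is taken to be an ordered list of phrases plus a proximity bound k (each phrase must start at most k characters after the previous one ends), and the measure is taken to be the empirical classification error on a labeled sample of documents. The names `find_all`, `matches`, `classification_error`, and `best_pattern` are hypothetical. The search is brute force over a given candidate list; it is not the computational-geometry and string-matching algorithm the paper develops.

```python
from typing import List, Sequence


def find_all(text: str, phrase: str) -> List[int]:
    """Return every start offset at which phrase occurs in text."""
    hits = []
    start = text.find(phrase)
    while start != -1:
        hits.append(start)
        start = text.find(phrase, start + 1)
    return hits


def matches(text: str, phrases: Sequence[str], k: int) -> bool:
    """Assumed semantics: the phrases occur in order, and each one starts
    at most k characters after the end of the previous one."""
    ends = None  # possible end offsets of the previously matched phrase
    for phrase in phrases:
        starts = find_all(text, phrase)
        if ends is None:
            reachable = {s + len(phrase) for s in starts}
        else:
            reachable = {
                s + len(phrase)
                for s in starts
                if any(0 <= s - e <= k for e in ends)
            }
        if not reachable:
            return False
        ends = reachable
    return True


def classification_error(pattern: Sequence[str], k: int,
                         positives: Sequence[str],
                         negatives: Sequence[str]) -> float:
    """Fraction of documents misclassified by the rule 'match => positive'.
    This measure is an assumption; the abstract only says 'a given
    statistical measure'."""
    wrong = sum(not matches(doc, pattern, k) for doc in positives)
    wrong += sum(matches(doc, pattern, k) for doc in negatives)
    return wrong / (len(positives) + len(negatives))


def best_pattern(candidates: Sequence[Sequence[str]], k: int,
                 positives: Sequence[str],
                 negatives: Sequence[str]) -> Sequence[str]:
    """Brute-force search: return the candidate with the lowest error."""
    return min(
        candidates,
        key=lambda c: classification_error(c, k, positives, negatives),
    )


if __name__ == "__main__":
    positives = ["data mining of text", "text data mining tools"]
    negatives = ["mining coal data", "text editor"]
    candidates = [("text",), ("mining",), ("data", "mining")]
    print(best_pattern(candidates, 1, positives, negatives))
```

On this toy sample the two-phrase candidate separates the labeled documents better than either single keyword, which illustrates why the order and proximity of phrases can carry more information than the individual words.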
doi:10.1109/dlrp.2000.942178 dblp:conf/kyotoDL/ArimuraASAFS00 fatcat:afnknisjfrdfpfojmjsrofnsdu