A Survey of Emerging Trend Detection in Textual Data Mining [chapter]

April Kontostathis, Leon M. Galitsky, William M. Pottenger, Soma Roy, Daniel J. Phelps
2004 Survey of Text Mining  
In this chapter we describe several systems that detect emerging trends in textual data. Some of the systems are semi-automatic, requiring user input to begin processing; others are fully-automatic, producing output from the input corpus without guidance. For each Emerging Trend Detection (ETD) system we describe components including linguistic and statistical features, learning algorithms, training and test set generation, visualization and evaluation. We also provide a brief overview of
... commercial products with capabilities for detecting trends in textual data, followed by an industrial viewpoint describing the importance of trend detection tools, and an overview of how such tools are used. This review of the literature indicates that much progress has been made toward automating the process of detecting emerging trends, but there is room for improvement. All of the projects we reviewed rely on a human domain expert to separate the emerging trends from noise in the system. Furthermore, we discovered that few projects have used formal evaluation methodologies to determine the effectiveness of the systems being created. Development and use of effective metrics for evaluation of ETD systems is critical. Work continues on the semi-automatic and fully-automatic systems we are developing at Lehigh University [HDD]. In addition to adding formal evaluation components to our systems, we are also researching methods for automatically developing training sets and for
doi:10.1007/978-1-4757-4305-0_9 fatcat:sw2yexyqhna2bjmrtc6cglzr6u