LeadLine: Interactive visual analysis of text data through event identification and exploration

Wenwen Dou, Xiaoyu Wang, Drew Skau, William Ribarsky, Michelle X. Zhou
2012 IEEE Conference on Visual Analytics Science and Technology (VAST)
Figure 1: Overview of LeadLine. Top right: people and entities related to President Obama (selected) are shown in the graph. Bottom right: locations mentioned in news articles related to the president. Left view: highlighted bursts indicate events that are related to President Obama.

ABSTRACT
Text data such as online news and microblogs bear valuable insights regarding important events and responses to such events. Events are inherently temporal, evolving over time. Existing visual text [...] systems have provided temporal views of changes based on topical themes extracted from text data. But few have associated topical themes with events that cause the changes. In this paper, we propose an interactive visual analytics system, LeadLine, to automatically identify meaningful events in news and social media data and support exploration of the events. To characterize events, LeadLine integrates topic modeling, event detection, and named entity recognition techniques to automatically extract information regarding the investigative 4 Ws: who, what, when, and where for each event. To further support analysis of the text corpora through events, LeadLine allows users to interactively examine meaningful events using the 4 Ws to develop an understanding of how and why. Through representing large-scale text corpora in the form of meaningful events, LeadLine provides a concise summary of the corpora. LeadLine also supports the construction of simple narratives through the exploration of events. To demonstrate the efficacy of LeadLine in identifying events and supporting exploration, two case studies were conducted using news and social media data.

Contact: wdou1@uncc.edu
doi:10.1109/vast.2012.6400485 dblp:conf/ieeevast/DouWSRZ12 fatcat:icluncbxo5fo5d4se4pawagseq