A Multiple Instance Learning Framework for Identifying Key Sentences and Detecting Events

Wei Wang, Yue Ning, Huzefa Rangwala, Naren Ramakrishnan
2016 Proceedings of the 25th ACM International on Conference on Information and Knowledge Management - CIKM '16  
State-of-the-art event encoding approaches rely on sentence- or phrase-level labeling, which is both time-consuming and infeasible to extend to large-scale text corpora and emerging domains. Using a multiple instance learning approach, we take advantage of the fact that while labels at the sentence level are difficult to obtain, they are relatively easy to gather at the document level. This enables us to view the problems of event detection and extraction in a unified manner. Using distributed representations of text, we develop a multiple instance formulation that simultaneously classifies news articles and extracts sentences indicative of events without any engineered features. We evaluate our model in its ability to detect news articles about civil unrest events (from Spanish text) across ten Latin American countries and identify the key sentences pertaining to these events. Our model, trained without annotated sentence labels, yields performance that is competitive with selected state-of-the-art models for event detection and sentence identification. Additionally, qualitative experimental results show that the extracted event-related sentences are informative and enhance various downstream applications such as article summarization, visualization, and event encoding.
doi:10.1145/2983323.2983821 dblp:conf/cikm/0064NRR16 fatcat:q64ozniftbbbrmil4ekohj3xya
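
The multiple instance idea in the abstract (document-level labels, sentence-level instances) can be illustrated with a minimal sketch. This is not the authors' model: the linear sentence scorer, the mean aggregation rule, the `top_k` cutoff, and all function names below are assumptions chosen for illustration. Each sentence vector stands in for a distributed representation; the document score is aggregated from sentence scores, and the highest-scoring sentences are returned as candidate key sentences.

```python
import math
from typing import List, Sequence, Tuple


def sigmoid(z: float) -> float:
    """Logistic function mapping a real-valued score to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def score_sentences(
    sentence_vecs: Sequence[Sequence[float]],
    weights: Sequence[float],
    bias: float,
) -> List[float]:
    """Per-sentence event probability.

    A linear scorer is a hypothetical stand-in for whatever learned
    sentence-level model is used in practice.
    """
    return [
        sigmoid(sum(w * x for w, x in zip(weights, vec)) + bias)
        for vec in sentence_vecs
    ]


def classify_document(
    sentence_vecs: Sequence[Sequence[float]],
    weights: Sequence[float],
    bias: float,
    top_k: int = 2,
    threshold: float = 0.5,
) -> Tuple[float, bool, List[int]]:
    """Aggregate sentence scores into a document score and pick key sentences.

    Assumption: the document probability is the mean of the sentence
    probabilities. Other aggregation rules (max, top-k mean) are equally
    plausible in a multiple instance setting.
    """
    if not sentence_vecs:
        raise ValueError("a document needs at least one sentence")
    probs = score_sentences(sentence_vecs, weights, bias)
    doc_prob = sum(probs) / len(probs)
    # Rank sentence indices by descending probability; keep the top_k.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return doc_prob, doc_prob >= threshold, ranked[:top_k]


# Toy document with three 2-dimensional "sentence embeddings".
toy_doc = [[2.0, 0.0], [-1.0, 0.5], [0.0, 3.0]]
doc_prob, is_event, key_sentences = classify_document(toy_doc, [1.0, 1.0], 0.0)
print(doc_prob, is_event, key_sentences)
```

In training, only the document-level label would enter the loss, so the sentence scores are learned indirectly. The sketch shows inference only and uses hand-set weights.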