Multifaceted toponym recognition for streaming news

Michael D. Lieberman, Hanan Samet
2011 Proceedings of the 34th international ACM SIGIR conference on Research and development in Information - SIGIR '11  
News sources on the Web generate constant streams of information, describing many aspects of the events that shape our world. In particular, geography plays a key role in the news, and enabling geographic retrieval of news articles involves recognizing the textual references to geographic locations (called toponyms) present in the articles, which can be difficult due to ambiguity in natural language. Toponym recognition in news is often accomplished with algorithms designed and tested around small corpora of news articles, but these static collections do not reflect the streaming nature of online news, as evidenced by poor performance in tests. In contrast, a method for toponym recognition is presented that is tuned for streaming news by leveraging a wide variety of recognition components, both rule-based and statistical. An evaluation of this method shows that it outperforms two prominent toponym recognition systems when tested on large datasets of streaming news, indicating its suitability for this domain.
doi:10.1145/2009916.2010029 dblp:conf/sigir/LiebermanS11 fatcat:c54odfqjfvdxvcf23lvmmpbsi4