Inverse Document Density: A Smooth Measure for Location-Dependent Term Irregularities

Dennis Thom, Harald Bosch, Thomas Ertl
2012 International Conference on Computational Linguistics  
The advent and recent popularity of location-enabled social media services like Twitter and Foursquare have brought a dataset of immense value to researchers in several domains, ranging from theory validation in computational sociology through market analysis to situation awareness in disaster management. Many of these applications, however, require evaluating the a priori relevance of trends, topics and terms in given regions of interest. Inspired by the well-known notion of the tf-idf weight combined with kernel density methods, we present a smooth measure that utilizes large corpora of social media data to facilitate scalable, real-time and highly interactive analysis of geolocated text. We describe the implementation specifics of our measure, which are grounded in aggregation and preprocessing strategies, and we demonstrate its practical usefulness with two case studies within a sophisticated visual analysis system.
dblp:conf/coling/ThomBE12 fatcat:pma4eq4rgzdl5hodollg4bxzsq