Di Jiang, Jan Vosecky, Kenneth Wai-Ting Leung, Wilfred Ng
2012 Proceedings of the 21st ACM international conference on Information and knowledge management - CIKM '12  
Search engine query logs are an important information source that captures the interests and information needs of millions of users. In this paper, we tackle the problem of discovering latent geographic search topics by mining search engine query logs. A novel framework, G-WSTD, comprising search session derivation, geographic information extraction and geographic search topic discovery, is developed to support a variety of downstream web applications. The core components of the framework are two models, which discover geographic search topics from two different perspectives. The first is the Discrete Search Topic Model (DSTM), which aims to capture the semantic commonalities across discrete geographic locations. The second is the Regional Search Topic Model (RSTM), which focuses on a specific region on the map and discovers web search topics that demonstrate geographic locality. We evaluate our framework against several strong baselines on a real-life query log. In the experiments, the framework demonstrates improved data interpretability, better prediction performance and higher topic distinctiveness. The effectiveness of the framework is also verified by applications such as user profiling and URL annotation.
doi:10.1145/2396761.2398414 dblp:conf/cikm/JiangVLN12 fatcat:6ijhobrxlrf6vi5boc4nbtyi3e