Mohamed Abdel Maksoud, Gaurav Pandey, Shuaiqiang Wang
2017 Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval - SIGIR '17  
We introduce CitySearcher, a vertical search engine that returns cities when queried with an interest. In search engines generally, exploiting semantic relationships between words improves performance. Even though ambiguous query words have multiple meanings, search engines can return diversified results to satisfy different users' information needs. For CitySearcher, however, mismatched semantic relationships can lead to extremely unsatisfactory results. For example, the ... Sale would incorrectly rank high for the interest "shopping" because of semantic interpretations of the words. Thus, the main challenge in our system is to eliminate the mismatched semantic relationships that arise as a side effect of the semantic models. In the example above, we aim to ignore the semantics of a city's name, which is not indicative of the city's characteristics. In CitySearcher, we use word2vec, a very popular word embedding technique, to estimate the semantics of words and create the initial ranking of the cities. To reduce the effect of mismatched semantic relationships, we generate a set of features for learning using a novel clustering-based method. With the generated features, we then use learning-to-rank algorithms to rerank the cities before returning them. We evaluate our system on the English version of the Wikivoyage dataset, from which we sample a very small dataset for training. Experimental results demonstrate the performance gain of our system over various standard retrieval techniques.
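The name-mismatch problem described in the abstract can be illustrated with a minimal sketch. It is not the authors' method: the paper relies on clustering-based features and learning to rank, whereas this toy simply drops the city-name tokens before scoring. The three-dimensional vectors, the `EMB` table, the city "Exampleville", and the function names are all hypothetical stand-ins for real word2vec embeddings and Wikivoyage text.

```python
import math

# Hypothetical 3-d vectors standing in for word2vec embeddings (made-up values).
EMB = {
    "shopping": [0.9, 0.1, 0.0],
    "sale":     [0.8, 0.2, 0.0],
    "mall":     [0.85, 0.15, 0.1],
    "river":    [0.0, 0.9, 0.1],
    "park":     [0.1, 0.8, 0.2],
    "quiet":    [0.1, 0.7, 0.3],
}

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def mean_vector(words):
    """Average the embeddings of the known words; unknown words are skipped."""
    vecs = [EMB[w] for w in words if w in EMB]
    if not vecs:
        return [0.0, 0.0, 0.0]
    return [sum(col) / len(vecs) for col in zip(*vecs)]

def rank_cities(interest, cities, ignore_name=False):
    """Rank city names by similarity between the interest and the city text."""
    query = EMB[interest]
    scores = {}
    for name, description in cities.items():
        words = list(description)
        if not ignore_name:
            # The city's own name contributes its word semantics to the score.
            words = name.lower().split() + words
        scores[name] = cosine(query, mean_vector(words))
    return sorted(scores, key=scores.get, reverse=True)

# Toy descriptions (assumed, not taken from Wikivoyage).
CITIES = {
    "Sale": ["river"],
    "Exampleville": ["mall", "park", "quiet"],
}

with_name = rank_cities("shopping", CITIES)
without_name = rank_cities("shopping", CITIES, ignore_name=True)
```

In this toy setup, "Sale" ranks first for "shopping" only while its name is part of the scored text. Once the name is ignored, the ranking reflects the descriptions alone.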
doi:10.1145/3077136.3080742 dblp:conf/sigir/MaksoudPW17 fatcat:h4w2ynzbubfw7enr66s2zew5ym