Automatic term mismatch diagnosis for selective query expansion

Le Zhao, Jamie Callan
Proceedings of the 35th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '12), 2012
People are seldom aware that their search queries frequently mismatch a majority of the relevant documents. This may not be a big problem for topics with a large and diverse set of relevant documents, but it greatly increases the chance of search failure for less popular search needs. We aim to address the mismatch problem by developing accurate and simple queries that require minimal effort to construct. This is achieved by targeting retrieval interventions at the query terms that are likely to mismatch relevant documents. For a given topic, the proportion of relevant documents that do not contain a term measures the probability for the term to mismatch relevant documents, or the term mismatch probability. Recent research demonstrates that this probability can be estimated reliably prior to retrieval. Typically, it is used in probabilistic retrieval models to provide query dependent term weights. This paper develops a new use: automatic diagnosis of term mismatch. A search engine can use the diagnosis to suggest manual query reformulation, guide interactive query expansion, guide automatic query expansion, or motivate other responses. The research described here uses the diagnosis to guide interactive query expansion and create Boolean conjunctive normal form (CNF) structured queries that selectively expand "problem" query terms while leaving the rest of the query untouched. Experiments with TREC Ad-hoc and Legal Track datasets demonstrate that with high quality manual expansion, this diagnostic approach can reduce user effort by 33% and produce simple, effective structured queries that surpass their bag-of-words counterparts.

Furnas et al. [7] observed that different people name the same concept/activity differently. They showed that on average, 80-90% of the time two people will name the same item differently. The best term covers only about 15-35% of all occurrences of the item, and the 3 best terms together cover only 37-67% of the cases. Even with 15 aliases, only 60-80% coverage is achieved. The authors suggested "unlimited aliasing" as one solution, which led to the Latent Semantic Analysis (LSA) [6] line of research. Zhao and Callan [32] formally defined the term mismatch probability to be P(t̄ | R), the likelihood that term t does not appear in a document d, given that d is relevant to the topic (d ∈ R), or equivalently, the proportion of relevant documents that do not contain term t. Furnas et al.
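The two ideas above — estimating a term's mismatch probability from the relevant set, and selectively expanding only the "problem" terms into a CNF query — can be sketched as follows. This is a minimal illustration, assuming relevance judgments are available as sets of terms; the function names, the 0.5 threshold, and the expansion dictionary are illustrative choices, not from the paper.

```python
# Sketch: term mismatch probability + selective CNF expansion.
# Assumes each relevant document is represented as a set of terms.

def term_mismatch_probability(term, relevant_docs):
    """P(t-bar | R): fraction of relevant documents NOT containing the term."""
    if not relevant_docs:
        raise ValueError("need at least one relevant document")
    missing = sum(1 for doc in relevant_docs if term not in doc)
    return missing / len(relevant_docs)

def diagnose_and_expand(query_terms, relevant_docs, expansions, threshold=0.5):
    """Build a CNF query: each high-mismatch term becomes a disjunctive
    clause with its expansion terms; low-mismatch terms are left alone."""
    clauses = []
    for t in query_terms:
        if term_mismatch_probability(t, relevant_docs) > threshold and t in expansions:
            clauses.append("(" + " OR ".join([t] + expansions[t]) + ")")
        else:
            clauses.append(t)
    return " AND ".join(clauses)

# Toy example: "car" misses 2 of the 3 relevant documents, so it is
# diagnosed as a problem term and expanded; "insurance" is untouched.
docs = [{"car", "insurance"}, {"auto", "insurance"}, {"vehicle", "insurance"}]
print(term_mismatch_probability("car", docs))   # 2/3
print(diagnose_and_expand(["car", "insurance"], docs,
                          {"car": ["auto", "vehicle"]}))
# → (car OR auto OR vehicle) AND insurance
```

In the paper's interactive setting the expansion candidates come from a user rather than a fixed dictionary, and P(t̄ | R) is predicted rather than computed from known judgments; the structure of the resulting CNF query is the same.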
[7]'s definition of vocabulary mismatch is query independent, and can be reduced to an average case of Zhao and Callan [32]'s query dependent definition. The complement of term mismatch is the term recall probability P(t | R). A low P(t | R) means term t tends not to appear in the documents relevant to the topic. This query dependent probability P(t | R) is not new in retrieval research: it is part of the optimal term weight in the Binary Independence Model (BIM) [23]. Accurate estimation of P(t | R) requires knowledge of R, the relevant set of a topic, which defeats the purpose of retrieval, so P(t | R) was long thought difficult to estimate. Recent research showed that P(t | R) can be reliably predicted without using relevance information of the test topics [8, 20, 32]. Zhao and Callan [32] achieved the best predictions by being the first to design and use query dependent features for prediction, such as term centrality, replaceability and abstractness. Previously, P(t | R) predictions were used to adjust query term weights of inverse document frequency (idf)-based retrieval models such as Okapi BM25 and statistical language models. Term weighting is not a new technique in retrieval research, and neither is predicting term weights. Our work is a significant departure from the prior research that predicted P(t | R): we apply the P(t | R) predictions in a completely new way, to automatically diagnose term mismatch problems and inform further interventions.
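To make concrete where P(t | R) enters the BIM term weight mentioned above, here is a minimal sketch. The standard BIM weight is log[p(1-q) / (q(1-p))], with p = P(t | R) and q the probability of the term in non-relevant documents, commonly approximated by collection document frequency; the function name and the df/N approximation are illustrative, not taken from the paper.

```python
import math

def bim_term_weight(p_t_given_R, df, N):
    """BIM term weight log[p(1-q) / (q(1-p))], where p = P(t | R) and
    q = P(t | non-relevant) is approximated by df / N (an idf-like term)."""
    q = df / N
    p = p_t_given_R
    return math.log((p * (1 - q)) / (q * (1 - p)))

# With no relevance information, p is often fixed (e.g. 0.5), reducing the
# weight to an idf-like quantity. A predicted P(t | R) makes the weight
# query dependent: a term with high predicted recall (p = 0.9) is weighted
# much higher than the same term with low predicted recall (p = 0.2).
print(bim_term_weight(0.9, 10, 100_000) > bim_term_weight(0.2, 10, 100_000))
```

This term-weighting use is exactly the prior application of P(t | R) predictions that the paper departs from; the diagnosis use keeps the same probability but triggers expansion instead of reweighting.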
doi:10.1145/2348283.2348354 dblp:conf/sigir/ZhaoC12