Phrase recognition and expansion for short, precision-biased queries based on a query log

Erika F. de Lima, Jan O. Pedersen
1999 Proceedings of the 22nd annual international ACM SIGIR conference on Research and development in information retrieval - SIGIR '99  
In this paper we examine the question of query parsing for World Wide Web queries and present a novel method for phrase recognition and expansion. Given a training corpus of approximately 16 million Web queries and a handwritten context-free grammar, the EM algorithm is used to estimate the parameters of a probabilistic context-free grammar (PCFG) with a system developed by Carroll [5]. We use the PCFG to compute the most probable parse for a user query, reflecting linguistic structure and word ... e of the domain being parsed. The optimal syntactic parse for a user query thus obtained is employed for phrase recognition and expansion. Phrase recognition is used to increase retrieval precision; phrase expansion is applied to make the best use possible of very short Web queries.
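As a rough illustration of the "most probable parse" step mentioned in the abstract, the sketch below runs a Viterbi-style CKY search over a tiny PCFG in Chomsky normal form and then reads noun phrases off the best tree. The grammar, its probabilities, the labels (`Q`, `NP`, `ADJ`, `N`), and the function names are all hypothetical and chosen for this example only; they are not the paper's grammar or system, and EM parameter estimation is not shown.

```python
from collections import defaultdict

# Hypothetical toy PCFG in Chomsky normal form (not the paper's grammar).
# Binary rules: (parent, left, right) -> probability.
BINARY = {
    ("Q", "NP", "N"): 0.5,
    ("Q", "N", "NP"): 0.5,
    ("NP", "ADJ", "N"): 0.7,
    ("NP", "N", "N"): 0.3,
}
# Lexical rules: (label, word) -> probability.
LEXICAL = {
    ("ADJ", "cheap"): 1.0,
    ("N", "flights"): 0.5,
    ("N", "paris"): 0.5,
}


def viterbi_cky(words, binary, lexical, start="Q"):
    """Return (probability, tree) of the most probable parse, or None."""
    n = len(words)
    # best[(i, j)] maps a label to its best (probability, tree) over words[i:j].
    best = defaultdict(dict)
    for i, word in enumerate(words):
        for (label, lex_word), p in lexical.items():
            if lex_word == word:
                best[i, i + 1][label] = (p, (label, word))
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):  # split point
                for (parent, left, right), p in binary.items():
                    if left in best[i, k] and right in best[k, j]:
                        left_p, left_tree = best[i, k][left]
                        right_p, right_tree = best[k, j][right]
                        cand = p * left_p * right_p
                        if cand > best[i, j].get(parent, (0.0, None))[0]:
                            best[i, j][parent] = (cand, (parent, left_tree, right_tree))
    return best[0, n].get(start)


def leaves(tree):
    """Collect the words under a tree node, left to right."""
    if isinstance(tree[1], str):  # preterminal node: (label, word)
        return [tree[1]]
    return leaves(tree[1]) + leaves(tree[2])


def phrases(tree, label="NP"):
    """List the multi-word spans whose node carries the given label."""
    if isinstance(tree[1], str):
        return []
    found = [" ".join(leaves(tree))] if tree[0] == label else []
    return found + phrases(tree[1], label) + phrases(tree[2], label)


if __name__ == "__main__":
    result = viterbi_cky("cheap flights paris".split(), BINARY, LEXICAL)
    if result is not None:
        prob, tree = result
        print(prob, tree, phrases(tree))
```

In this toy setting, a recognized phrase such as the `NP` span could be submitted to a retrieval engine as a quoted phrase; how the paper itself maps parses to phrase and expansion operators is not stated in this record.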
doi:10.1145/312624.312669 dblp:conf/sigir/LimaP99 fatcat:e6t6vrspeneord7xcm7luoicvq