The file type is application/pdf.
Choosing the Right Bigrams for Information Retrieval
[chapter]
2004
Classification, Clustering, and Data Mining Applications
After more than 30 years of research in information retrieval, the dominant paradigm remains the "bag-of-words", in which query terms are considered independent of their co-occurrences with each other. Although there has been some work on incorporating phrases or other syntactic information into IR, such attempts have given modest and inconsistent improvements at best. This paper is a first step toward investigating more deeply the question of using bigrams for information retrieval. Our results
doi:10.1007/978-3-642-17103-1_50
fatcat:i5p5uksdk5dyvdrjvh4kp4o2nu