The Feasibility of Brute Force Scans for Real-Time Tweet Search

Yulu Wang, Jimmy Lin
2015 Proceedings of the 2015 International Conference on Theory of Information Retrieval - ICTIR '15  
The real-time search problem requires making ingested documents immediately searchable, which presents architectural challenges for systems built around inverted indexing. In this paper, we explore a radical proposition: What if we abandon document inversion and instead adopt an architecture based on brute force scans of document representations? In such a design, "indexing" simply involves appending the parsed representation of an ingested document to an existing buffer, which is simple and [...]. Quite surprisingly, experiments with TREC Microblog test collections show that query evaluation with brute force scans is feasible and performance compares favorably to a traditional search architecture based on an inverted index, especially if we take advantage of vectorized SIMD instructions and multiple cores in modern processor architectures. We believe that such a novel design is worth further exploration by IR researchers and practitioners.
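The append-then-scan idea described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the class name `ScanBuffer`, the whitespace tokenizer, and the term-frequency-sum scoring are all assumptions made for illustration, and the sketch uses neither SIMD instructions nor multiple cores.

```python
import heapq
from collections import Counter


class ScanBuffer:
    """Hypothetical append-only buffer searched by a full linear scan."""

    def __init__(self):
        # Each entry is (doc_id, term-frequency Counter); no inverted index is built.
        self.docs = []

    def ingest(self, doc_id, text):
        # "Indexing" is just appending the parsed document to the buffer,
        # so the document is searchable as soon as this call returns.
        self.docs.append((doc_id, Counter(text.lower().split())))

    def search(self, query, k=10):
        # Brute force scan: score every buffered document against the query.
        # Scoring (sum of query-term frequencies) is an assumed placeholder.
        terms = query.lower().split()
        scored = []
        for doc_id, tf in self.docs:
            score = sum(tf[t] for t in terms)
            if score > 0:
                scored.append((score, doc_id))
        # Return the k highest-scoring (score, doc_id) pairs, best first.
        return heapq.nlargest(k, scored)


buf = ScanBuffer()
buf.ingest(1, "brute force scan")
buf.ingest(2, "inverted index")
buf.ingest(3, "scan scan buffer")
print(buf.search("scan"))
```

In this sketch, ingestion cost is constant per document, while query cost grows linearly with the buffer size; that trade-off is what the abstract proposes to offset with vectorized instructions and parallel cores.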
doi:10.1145/2808194.2809489 dblp:conf/ictir/WangL15 fatcat:dtyqpyqczbe5toh53cmlbfpf4y