The Feasibility of Brute Force Scans for Real-Time Tweet Search
Proceedings of the 2015 International Conference on Theory of Information Retrieval - ICTIR '15
The real-time search problem requires making ingested documents immediately searchable, which presents architectural challenges for systems built around inverted indexing. In this paper, we explore a radical proposition: What if we abandon document inversion and instead adopt an architecture based on brute force scans of document representations? In such a design, "indexing" simply involves appending the parsed representation of an ingested document to an existing buffer, which is simple and
... ch is simple and fast. Quite surprisingly, experiments with TREC Microblog test collections show that query evaluation with brute force scans is feasible and performance compares favorably to a traditional search architecture based on an inverted index, especially if we take advantage of vectorized SIMD instructions and multiple cores in modern processor architectures. We believe that such a novel design is worth further exploration by IR researchers and practitioners.