SIMD-based decoding of posting lists

Alexander A. Stepanov, Anil R. Gangolli, Daniel E. Rose, Ryan J. Ernst, Paramjit S. Oberoi
2011 Proceedings of the 20th ACM international conference on Information and knowledge management - CIKM '11  
Powerful SIMD instructions in modern processors offer an opportunity for greater search performance. In this paper, we apply these instructions to decoding search engine posting lists. We start by exploring variable-length integer encoding formats used to represent postings. We define two properties, byte-oriented and byte-preserving, that characterize many formats of interest. Based on their common structure, we define a taxonomy that classifies encodings along three dimensions, representing
more » ... e way in which data bits are stored and additional bits are used to describe the data. Using this taxonomy, we discover new encoding formats, some of which are particularly amenable to SIMDbased decoding. We present generic SIMD algorithms for decoding these formats. We also extend these algorithms to the most common traditional encoding format. Our experiments demonstrate that SIMD-based decoding algorithms are up to 3 times faster than non-SIMD algorithms.
doi:10.1145/2063576.2063627 dblp:conf/cikm/StepanovGREO11 fatcat:x5fwdllgs5azxcvsdlgw4n6jy4