VAST-Tree

Takeshi Yamamuro, Makoto Onizuka, Toshio Hitaka, Masashi Yamamuro
Proceedings of the 15th International Conference on Extending Database Technology (EDBT '12), 2012
We propose a compact and efficient index structure for massive data sets. Several indexing techniques, such as binary trees and B+trees, are widely used and well known. Unfortunately, we find that these techniques suffer two major shortcomings when applied to massive data sets: first, their indices are so large that they can overflow regular main memory, and, second, they incur a variety of penalties (e.g., conditional branches, low cache hit rates, and TLB misses) that restrict the number of instructions executed per processor cycle. Our state-of-the-art index structure, called VAST-Tree, classifies branch nodes into multiple layers. It applies existing techniques such as cache-conscious, aligned, and branch-free structures to the top layers of branch nodes in trees. Next, it applies an adaptive compression technique to save space and harnesses data parallelism with SIMD instructions in the middle and bottom layers of branch nodes. Moreover, a processor-friendly compression technique is applied to leaf nodes. The end result is that trees are much more compact and traversal efficiency is high. We implement a prototype and compare its index size and performance against binary trees and the hardware-conscious technique called FAST, which currently offers the highest performance. Compared to current alternatives, VAST-Tree compacts the branch nodes by more than 95% and the overall index size by 47-84% given 2^30 keys. With 2^28 keys, it achieves roughly 6.0-times and 1.24-times the throughput and reduces memory consumption by more than 94.7% and 40.5% as compared to binary trees and FAST, respectively.
doi:10.1145/2247596.2247643 dblp:conf/edbt/YamamuroOHY12 fatcat:aowshe4vfvhrbae2sctirwbqqq
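
The abstract's mention of aligned, branch-free branch nodes searched with SIMD instructions can be illustrated with a small sketch. The C fragment below is not the authors' implementation; it assumes SSE2, four sorted 32-bit separator keys per branch node, and hypothetical names such as branch_node and child_index_simd. It only shows the general idea that FAST and the top layers of VAST-Tree build on: one SIMD comparison of the search key against an entire node replaces a chain of conditional branches.

```c
/* Minimal, hypothetical sketch of a branch-free SIMD key comparison in the
 * style used by FAST and the top layers of VAST-Tree. Node layout, blocking,
 * and the adaptive key compression described in the abstract are omitted. */
#include <emmintrin.h>   /* SSE2 intrinsics */
#include <stdint.h>
#include <stdio.h>

/* One small branch node: 4 sorted separator keys delimiting 5 children. */
typedef struct {
    int32_t keys[4];     /* sorted separators; 16-byte aligned in practice */
} branch_node;

/* Return which of the 5 child slots the search key falls into, using a
 * single SIMD compare plus a popcount instead of conditional branches. */
static int child_index_simd(const branch_node *n, int32_t key)
{
    __m128i search = _mm_set1_epi32(key);                  /* broadcast key   */
    __m128i seps   = _mm_loadu_si128((const __m128i *)n->keys);
    __m128i gt     = _mm_cmpgt_epi32(search, seps);        /* key > sep[i] ?  */
    int mask = _mm_movemask_ps(_mm_castsi128_ps(gt));      /* 4-bit result    */
    return __builtin_popcount(mask);                       /* #separators < key */
}

int main(void)
{
    branch_node n = { { 10, 20, 30, 40 } };
    printf("%d %d %d\n",
           child_index_simd(&n, 5),    /* 0: before all separators */
           child_index_simd(&n, 25),   /* 2: between 20 and 30     */
           child_index_simd(&n, 99));  /* 4: after all separators  */
    return 0;
}
```

The adaptive compression that the abstract describes for the middle and bottom layers (packing more, shorter separator keys into each SIMD register) and the processor-friendly leaf compression are not modeled here; the sketch only illustrates why a branch-free comparison avoids the branch-misprediction penalties the abstract lists.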