Fast Lookups for In-Memory Column Stores: Group-Key Indices, Lookup and Maintenance

Martin Faust, David Schwalb, Jens Krüger, Hasso Plattner
2012 Very Large Data Bases Conference  
In-memory column-oriented databases have become a major topic of interest in academia and commercial applications. The demand for analytics on up-to-the-minute data and the availability of systems with hundreds of gigabytes of main memory led to the proposal of combined systems, which provide a single database for operational processing and ad hoc analytical queries on current data. Recent research has identified In-Memory Column-Stores as a possible database architecture to meet these ... ts. They are claimed to be capable of delivering the analytical insights while providing sufficient transactional performance. Data therein is typically split up into a write-optimized partition, which gains speed from its small size and tree-structured indices, and a larger read-only partition. To enable fast transactional and analytical performance, an index on the large, read-only partition is advisable in many cases. In this paper we present an index structure for the read-only partition, describe its advantage over the column scan and present an algorithm for the maintenance of the index. The index drastically reduces the memory traffic during query execution, leading to faster lookups and joins, thereby providing benefits to transactional and analytical processing. We analyze the memory traffic of index lookups in comparison with full column scans and the maintenance of the index structure. We develop formulas to determine the viability of an index lookup over a column scan at query runtime. While other research claimed that an index for in-memory systems should just be rebuilt after every bulk-load, we show that a substantial performance increase can be achieved by reusing the former index to create an updated index.
dblp:conf/vldb/FaustSKP12 fatcat:vhwncedqgncufgsskorhwqgdxe