Asynchronous view maintenance for VLSD databases

Parag Agrawal, Adam Silberstein, Brian F. Cooper, Utkarsh Srivastava, Raghu Ramakrishnan
2009 Proceedings of the 35th SIGMOD international conference on Management of data - SIGMOD '09  
The query models of the recent generation of very large scale distributed (VLSD) shared-nothing data storage systems, including our own PNUTS and others (e.g. BigTable, Dynamo, Cassandra, etc.) are intentionally simple, focusing on simple lookups and scans and trading query expressiveness for massive scale. Indexes and views can expand the query expressiveness of such systems by materializing more complex access paths and query results. In this paper, we examine mechanisms to implement indexes
more » ... nd views in a massive scale distributed database. For web applications, minimizing update latencies is critical, so we advocate deferring the work of maintaining views and indexes as much as possible. We examine the design space, and conclude that two types of view implementations, called remote view tables (RVTs) and local view tables (LVTs), provide a good tradeoff between system throughput and minimizing view staleness. We describe how to construct and maintain such view tables, and how they can be used to implement indexes, group-by-aggregate views, equijoin views and selection views. We also introduce and analyze a consistency model that makes it easier for application developers to cope with the impact of deferred view maintenance. An empirical evaluation quantifies the maintenance costs of our views, and shows that they can significantly improve the cost of evaluating complex queries.
doi:10.1145/1559845.1559866 dblp:conf/sigmod/AgrawalSCSR09 fatcat:docpqzi6gnbnfb4qnb3exqxtge